SUPPRESSORS OF CYTOKINE SIGNALING: RELATED REAGENTS

This filing is a PCT Application claiming priority 
to provisional U.S. Patent Applications USSN 60/055,804, 
filed August 15, 1997, and USSN 60/053,153, filed July 
18, 1997. Also incorporated by reference are provisional 
U.S. Patent Applications USSN 60/055,853, filed August
15, 1997, and USSN 60/053,244, filed July 18, 1997. 

FIELD OF THE INVENTION 
The present invention pertains to compositions
related to proteins which function, e.g., in suppressing
intracellular signaling pathways, e.g., cytokine
signaling. In particular, it provides purified genes,
proteins, antibodies, and related reagents useful, e.g.,
to regulate growth hormone-like or cytokine-regulated
intracellular processes, including transcription of genes
in various cell types, including immune cells.

BACKGROUND OF THE INVENTION 

Recombinant DNA technology refers generally to the
technique of integrating genetic information from a donor
source into vectors for subsequent processing, such as
through introduction into a host, whereby the transferred
genetic information is copied and/or expressed in the new
environment. Commonly, the genetic information exists in
the form of complementary DNA (cDNA) derived from
messenger RNA (mRNA) coding for a desired protein
product. The carrier is frequently a plasmid having the
capacity to incorporate cDNA for later replication in a
host and, in some cases, actually to control expression
of the cDNA and thereby direct synthesis of the encoded
product in the host.

For some time, it has been known that the mammalian
immune response is based on a series of complex cellular
interactions, called the "immune network". Recent
research has provided new insights into the inner
workings of this network. While it remains clear that
much of the response does, in fact, revolve around the
network-like interactions of lymphocytes, macrophages,
granulocytes, and other cells, immunologists now
generally hold the opinion that soluble proteins, known
as lymphokines, cytokines, or monokines, play a critical
role in controlling these cellular interactions. Thus,
there is considerable interest in the isolation,
characterization, and mechanisms of action of cell
modulatory factors, an understanding of which will lead
to significant advancements in the diagnosis and therapy
of numerous medical abnormalities, e.g., immune system
disorders. Some of these factors are hematopoietic
growth factors, e.g., granulocyte colony stimulating
factor (G-CSF), and others are regulatory molecules.
See, e.g., Thomson (1994; ed.) The Cytokine Handbook (2d
ed.) Academic Press, San Diego; Metcalf and Nicola (1995)
The Hematopoietic Colony Stimulating Factors Cambridge
University Press; and Aggarwal and Gutterman (1991) Human
Cytokines Blackwell Pub.

Lymphokines apparently mediate cellular activities
in a variety of ways. They have been shown to support
the proliferation, growth, and differentiation of, e.g.,
pluripotential hematopoietic stem cells into vast numbers
of progenitors comprising diverse cellular lineages
making up a complex immune system. Proper and balanced
interactions between cellular components are necessary
for a healthy developmental or immune response. The
different cellular lineages often respond in a different
manner when lymphokines are administered in conjunction
with other agents.

In the immune system, many of the effects of known
cytokines on gene transcription are known to be mediated
by cytokine inducible DNA binding proteins. See, e.g.,
Paul (ed. 1994) Fundamental Immunology, 3rd ed., Raven
Press, New York, NY. The mechanisms of signal
transduction have been an area of active recent study,
and involve protein phosphorylation and dephosphorylation
with, e.g., the Janus kinases (JAKs) and Signal
Transducers and Activators of Transcription (STATs).
See, e.g., Ihle (1996) Cell 84:331-334; Ivashkiv (1995)
Immunity 3:1-4; and Ihle and Kerr (1995) Trends in
Genetics 11:69-74.

The lack of knowledge regarding the mechanisms of
signaling involved in the regulation of cell cycle or
transcriptional elements has hampered the ability of
medical science to specifically regulate cell division or
cellular responses, including immune responses. The
present invention provides compositions which will be
important in such regulation.

SUMMARY OF THE INVENTION 

The present invention is based in part upon the
discovery of intracellular regulatory molecules which can
block signal transduction, e.g., through growth factor-
or cytokine-receptor superfamily signaling mechanisms.
These proteins exhibit a structural feature designated a
SOCS box. See Hilton, et al. (1998) Proc. Nat'l Acad.
Sci. USA 95:114-119. Moreover, the SOCS3 protein can
block the IL-2 induced signaling via the STATS,
establishing function of the SOCS proteins as suppressors
of cytokine signaling.

The invention provides a substantially pure or
recombinant SOCS14 protein or peptide exhibiting identity
over a length of at least about 12 amino acids to SEQ ID
NO: 2 or 6; a natural sequence SOCS14 of SEQ ID NO: 2 or
6; a fusion protein comprising SOCS14 sequence; a
substantially pure or recombinant SOCS15 (also designated
WDS11) protein or peptide exhibiting identity over a
length of at least about 12 amino acids to SEQ ID NO: 4
or 8; a natural sequence SOCS15 (WDS11) of SEQ ID NO: 4
or 8; a fusion protein comprising SOCS15 (WDS11)
sequence; a substantially pure or recombinant SOCS17
protein or peptide exhibiting identity over a length of
at least about 12 amino acids to SEQ ID NO: 10; a natural
sequence SOCS17 of SEQ ID NO: 10; a fusion protein
comprising SOCS17 sequence; a substantially pure or
recombinant SOCS18 protein or peptide exhibiting identity
over a length of at least about 12 amino acids to SEQ ID
NO: 12; a natural sequence SOCS18 of SEQ ID NO: 12; a
fusion protein comprising SOCS18 sequence; a
substantially pure or recombinant SOCS19 protein or
peptide exhibiting identity over a length of at least
about 12 amino acids to SEQ ID NO: 14; a natural sequence
SOCS19 of SEQ ID NO: 14; a fusion protein comprising
SOCS19 sequence; or a substantially pure or recombinant
WDS12 protein or peptide exhibiting identity over a
length of at least about 12 amino acids to SEQ ID NO: 16;
a natural sequence WDS12 of SEQ ID NO: 16; or a fusion
protein comprising WDS12 sequence. In preferred
embodiments, the portion is at least about 25 amino
acids. In other embodiments, the: SOCS14 comprises a
mature sequence of SEQ ID NO: 2 or 6; SOCS15 (WDS11)
comprises a mature sequence of SEQ ID NO: 4 or 8; SOCS17
comprises a mature sequence of SEQ ID NO: 10; SOCS18
comprises a mature sequence of SEQ ID NO: 12; SOCS19
comprises a mature sequence of SEQ ID NO: 14; WDS12
comprises a mature sequence of SEQ ID NO: 16; protein or
peptide: is from a warm blooded animal selected from a
mammal, including a primate; comprises at least one
polypeptide segment of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14,
or 16; exhibits a plurality of portions exhibiting the
identity; is a natural allelic variant of SOCS14, SOCS15
(WDS11), SOCS17, SOCS18, SOCS19, or WDS12; has a length
at least about 30 amino acids; exhibits at least two non-
overlapping epitopes which are specific for a mammalian
SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or WDS12;
exhibits identity over a length of at least about 20
amino acids to SOCS14, SOCS15 (WDS11), SOCS17, SOCS18,
SOCS19, or WDS12; exhibits at least two non-overlapping
epitopes which are specific for a SOCS14, SOCS15 (WDS11),
SOCS17, SOCS18, SOCS19, or WDS12; exhibits identity over
a length of at least about 25 amino acids to a primate
SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or WDS12;
is glycosylated; is a synthetic polypeptide; is attached
to a solid substrate; is conjugated to another chemical
moiety; is a 5-fold or less substitution from natural
sequence; or is a deletion or insertion variant from a
natural sequence. Various preferred embodiments include
a composition comprising: a sterile SOCS14, SOCS15
(WDS11), SOCS17, SOCS18, SOCS19, or WDS12 protein or
peptide; the SOCS14, SOCS15 (WDS11), SOCS17, SOCS18,
SOCS19, or WDS12 protein or peptide and a carrier,
wherein the carrier is: an aqueous compound, including
water, saline, and/or buffer; and/or formulated for oral,
rectal, nasal, topical, or parenteral administration.
The invention further provides a fusion protein,
comprising: mature protein comprising sequence of SEQ ID
NO: 2, 6, 4, 8, 10, 12, 14 or 16; a detection or
purification tag, including a FLAG, His6, or Ig sequence;
or sequence of another SOCS or WDS protein.

These reagents also make available a kit comprising
such a protein or polypeptide, and: a compartment
comprising the protein or polypeptide; and/or
instructions for use or disposal of reagents in the kit.

Providing an antigen, the invention further provides
a binding compound comprising an antigen binding portion
from an antibody, which specifically binds to a natural
SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or WDS12
protein, wherein: the protein is a primate protein; the
binding compound is an Fv, Fab, or Fab2 fragment; the
binding compound is conjugated to another chemical
moiety; or the antibody: is raised against a peptide
sequence of a mature polypeptide comprising sequence of
SEQ ID NO: 2, 6, 4, 8, 10, 12, 14 or 16; is raised
against a mature SOCS14, SOCS15 (WDS11), SOCS17, SOCS18,
SOCS19, or WDS12; is raised to a purified SOCS14, SOCS15
(WDS11), SOCS17, SOCS18, SOCS19, or WDS12; is
immunoselected; is a polyclonal antibody; binds to a
denatured SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19,
or WDS12; exhibits a Kd to antigen of at least 30 µM; is
attached to a solid substrate, including a bead or
plastic membrane; is in a sterile composition; or is
detectably labeled, including a radioactive or
fluorescent label. Preferred kits include those
containing the binding compound, and: a compartment
comprising the binding compound; and/or instructions for
use or disposal of reagents in the kit. Many of the kits
will be used for making a qualitative or quantitative
analysis.

Other preferred compositions will be those
comprising: a sterile binding compound, or the binding
compound and a carrier, wherein the carrier is: an
aqueous compound, including water, saline, and/or buffer;
and/or formulated for oral, rectal, nasal, topical, or
parenteral administration.

The present invention further provides an isolated
or recombinant nucleic acid encoding a protein or peptide
or fusion protein described above, wherein: the SOCS or
WDS family protein is from a mammal, including a primate;
or the nucleic acid: encodes an antigenic peptide
sequence of SEQ ID NO: 2, 6, 4, 8, 10, 12, 14 or 16;
encodes a plurality of antigenic peptide sequences of SEQ
ID NO: 2, 6, 4, 8, 10, 12, 14 or 16; exhibits identity to
a natural cDNA encoding the segment; is an expression
vector; further comprises an origin of replication; is
from a natural source; comprises a detectable label;
comprises synthetic nucleotide sequence; is less than 6
kb, preferably less than 3 kb; is from a mammal,
including a primate; comprises a natural full length
coding sequence; is a hybridization probe for a gene
encoding the SOCS or WDS family protein; or is a PCR
primer, PCR product, or mutagenesis primer. In certain
embodiments, the invention provides a cell or tissue
comprising such a recombinant nucleic acid. Preferred
cells include: a prokaryotic cell; a eukaryotic cell; a
bacterial cell; a yeast cell; an insect cell; a mammalian
cell; a mouse cell; a primate cell; or a human cell.

Other kit embodiments include a kit comprising the
described nucleic acid, and: a compartment comprising the
nucleic acid; a compartment further comprising a SOCS14,
SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or WDS12 protein
or polypeptide; and/or instructions for use or disposal
of reagents in the kit. In many versions, the kit is
capable of making a qualitative or quantitative analysis.

Other nucleic acid embodiments include those which:
hybridize under wash conditions of 50° C and less than
500 mM salt to SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, or 15;
exhibit identity over a stretch of at least about 30
nucleotides to a SOCS14, SOCS15 (WDS11), SOCS17, SOCS18,
SOCS19, or WDS12. In other embodiments: the wash
conditions are at 55° C and/or 300 mM salt; 60° C and/or
150 mM salt; the identity is over a stretch of at least
55 or 75 nucleotides.

In other embodiments, the invention provides a
method of modulating physiology or development of a cell
or tissue culture cells comprising introducing into such
cell an agonist or antagonist of a SOCS14, SOCS15
(WDS11), SOCS17, SOCS18, SOCS19, or WDS12.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

I. General

It is to be understood that this invention is not
limited to the particular compositions, methods, and
techniques described herein, as such compositions,
methods, and techniques may, of course, vary. It is also
to be understood that the terminology used herein is for
the purpose of describing particular embodiments only,
and is not intended to limit the scope of the present
invention which is only limited by the appended claims.
As used herein, including the appended claims,
singular forms of words such as "a," "an," and "the"
include their corresponding plural referents unless the
context clearly dictates otherwise. Thus, e.g.,
reference to "a polynucleotide" includes one or more
different polynucleotides, reference to "a composition"
includes one or more of such compositions, and reference
to "a method" includes reference to equivalent steps and
methods known to a person of ordinary skill in the art,
and so forth.

Unless otherwise defined, all technical and
scientific terms used herein have the same meaning as
commonly understood by a person of ordinary skill in the
art to which this invention belongs. Although methods
and materials similar or equivalent to those described
herein can be used in the practice or testing of the
present invention, suitable methods and materials are
described below. All publications, patent applications,
patents, and other references discussed above are
provided solely for their disclosure prior to the filing
date of the present application. Nothing herein is to be
construed as an admission that the invention is not
entitled to antedate any such disclosure by virtue of its
prior invention. All publications, patent applications,
patents, and other references mentioned herein are
incorporated by reference in their entirety including all
figures, references, and drawings.

The proliferation, differentiation, and
physiological responses of many cell lineages are
regulated by secreted proteins, e.g., cytokines. These
molecules often exert their biological effects through
binding to cell surface receptors that are associated
with one or more members of the Janus Kinase (JAK) family
of cytoplasmic tyrosine kinases. For example, cytokine
induced receptor dimerization leads to the activation of
JAKs, rapid tyrosine phosphorylation of cytoplasmic
domains, and subsequent recruitment of various signaling
proteins to the receptor complex, including members of
the STAT family of transcription factors. The JAK and
STAT proteins are enzymes which act to transduce a signal
from the cell surface to the nucleus, thereby serving as
the pathway to signal the cell to respond physiologically
to an external signal. These pathways have been shown to
involve certain protein phosphorylation or
dephosphorylation steps, thereby leading to response or
lack of response by the cell. See, e.g., Ihle (1996)
Cell 84:331-334; Ivashkiv (1995) Immunity 3:1-4; Ihle, et
al. (1995) Ann. Rev. Immunol. 13:369-398; Ihle and Kerr
(1995) Trends in Genetics 11:69-74; and Darnell, et al.
(1994) Science 264:1415-1421.

A number of novel genes have been identified from
mouse or humans which appear to inhibit STAT function.
See, e.g., Yoshimura, et al. (1995) EMBO J. 14:2816-2826;
Matsumoto, et al. (1997) Blood 89:3148-3154; Starr, et
al. (1997) Nature 387:917-921; Endo, et al. (1997) Nature
387:921-924; and Naka, et al. (1997) Nature 387:924-929.
The present invention provides additional genes with
sequence related to those, designated Suppressors Of
Cytokine Signaling or WDS: SOCS14, SOCS15 (WDS11), SOCS17,
SOCS18, SOCS19, or WDS12.

A primate, e.g., human, SOCS14 cDNA fragment and
corresponding open reading frame are provided in SEQ ID
NO: 1 and 2. The translation exhibits significant
matching and similarity to other identified SOCS family
members. The internal stop codon indicates some errors
in the sequence at or near those positions. Additional
refined sequence of primate, e.g., human, SOCS14 is
provided in SEQ ID NO: 5 and 6.

A rodent, e.g., mouse, SOCS15 cDNA fragment and
corresponding open reading frame are provided in SEQ ID
NO: 3 and 4. The translation exhibits significant
matching and similarity to other identified SOCS family
members. The internal stop codon indicates some errors
in the sequence at or near those positions.

A rodent, e.g., murine, SOCS17 cDNA and corresponding
open reading frame are provided in SEQ ID NO: 9 and 10.
Nucleotide may be A, C, T, or G at positions: 1680, 1691,
1696, 1704, 1707, 1728, 1740, 1743, 1746, 1755, 1760,
1770, 1773, 1802, 1816, 1817, 1823, 1826, 1827, 1846,
1851, 1857, 1861, 1880, 1885, 1909, 1917, 1920, 1929,
1946, 1953, 1967, 1968, 1980, 1991, 1995, 2001, 2004,
2021, 2033, 2034, 2035, 2036, 2037, 2039, 2040, 2042,
2048, 2051, 2054, 2061, 2075, 2081, 2083, 2084, 2085,
2088, 2105, 2121, 2124, 2132, 2137, 2147, 2149, 2151,
2152, 2160, 2165, 2177, 2179 and 2196; nucleotide may be
A or C at position 494; nucleotide may be C or T at
positions: 498, 501, 1455, 1524, 1527, 1621, 1829, and
2072; nucleotide may be G or C at positions: 499, 1618,
and 1664; nucleotide may be G or T at position 1673; and
nucleotide may be A, C, or G at positions: 1819, 1840,
and 2089 (see SEQ ID NO: 26).
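The position-by-position alternatives listed above correspond to the standard IUPAC nucleotide ambiguity codes (for example, "A or C" is M and "A, C, T, or G" is N). The following is a minimal sketch, not part of the sequence listing, of how such annotations could be folded into a single ambiguity-coded string. The function names and the short toy sequence are hypothetical; the actual sequences are those of the SEQ ID NOs.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of allowed bases.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("GC"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def ambiguity_code(bases):
    """Return the IUPAC code for a set of allowed bases, e.g. 'AC' -> 'M'."""
    return IUPAC_CODES[frozenset(bases.upper())]


def apply_ambiguities(sequence, alternatives):
    """Replace 1-based positions with the IUPAC code for their allowed bases.

    `alternatives` maps a 1-based position to a string of allowed bases,
    mirroring statements such as "nucleotide may be C or T at position 498".
    """
    coded = list(sequence.upper())
    for position, bases in alternatives.items():
        coded[position - 1] = ambiguity_code(bases)
    return "".join(coded)


# Hypothetical toy sequence; positions are illustrative only.
toy = apply_ambiguities("ACGTACGT", {2: "CT", 5: "ACGT"})
```

Keying the table on a frozenset makes the lookup independent of the order in which the alternatives are written, so "G or C" and "C or G" resolve to the same code.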

A primate, e.g., human, SOCS18 nucleotide and
corresponding amino acid sequence are provided in SEQ ID
NO: 11 and 12. Nucleotide may be A or C at positions:
740, 797, 2139, and 2184; nucleotide may be G or T at
positions: 761, 1313, 1508, and 2226; nucleotide may be C
or T at positions: 746, 1460, 1499, 2009, 2010, 2199, and
2225; nucleotide may be A or G at positions: 788, 863,
1550, 2178, 2188, 2197, and 2211; nucleotide may be G or
C at positions: 1163 and 1544; nucleotide may be A or T
at positions: 2058 and 2128; and nucleotide may be A, C,
T, or G at position 2251 (see SEQ ID NO: 27).

A primate, e.g., human, SOCS19 nucleotide and
corresponding amino acid sequence are provided in SEQ ID
NO: 13 and 14. Nucleotide may be A, C, T, or G at
positions: 2078 and 2116; and nucleotide may be G or C
at position 2063 (see SEQ ID NO: 28).

Finally, a primate, e.g., human, WDS12 nucleotide
and corresponding amino acid sequence are provided in SEQ
ID NO: 15 and 16. Nucleotide may be A, C, T, or G at
positions: 108 and 109; nucleotide may be A or G at
positions: 236, 238, and 1258; nucleotide may be G or T
at position 233; nucleotide may be G or C at position
234; nucleotide may be C or T at position 237; and
nucleotide may be A or T at position 239 (see SEQ ID NO:
29).

SOCS proteins are a family of proteins ranging from
approximately 30-60 kD which inhibit JAK kinase activity.
The amino portion of SOCS proteins contains an SH2 binding
motif and the carboxy portion of the molecule contains a
SOCS box motif which may play a role in dimerization of
SOCS proteins. The WDS are closely related in sequence.

SOCS3 expression is induced by IL-2 and can be
detected by approximately 1 hour after IL-2 activation.
Subsequently, SOCS expression is decreased relatively
rapidly (e.g., approximately 8 hrs after activation).
Western blots show that SOCS3 interacts with IL-2
receptor and JAK1 following IL-2 stimulation.

II. Definitions 

The term "binding composition" refers to molecules
that bind with specificity to SOCS14, SOCS15 (WDS11),
SOCS17, SOCS18, SOCS19, or WDS12 protein, e.g., in an
antibody-antigen interaction. However, other compounds,
e.g., binding proteins, may also specifically associate
with SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or
WDS12 proteins in contrast to other molecules.
Typically, the association will be in a natural
physiologically relevant protein-protein interaction,
either covalent or non-covalent, and may include members
of a multiprotein complex, including carrier compounds or
dimerization partners. The molecule may be a polymer, or
chemical reagent. A functional analog may be a protein
with structural modifications, or may be a wholly
unrelated molecule, e.g., which has a molecular shape
which interacts with the appropriate protein binding
determinants. The proteins may serve as agonists or
antagonists of the binding partner, see, e.g., Goodman,
et al. (eds.) (1990) Goodman & Gilman's: The
Pharmacological Bases of Therapeutics (8th ed.) Pergamon
Press, Tarrytown, N.Y.

The term "binding agent:SOCS or :WDS protein
complex", as used herein, refers to a complex of a
binding agent and a SOCS14, SOCS15 (WDS11), SOCS17,
SOCS18, SOCS19, or WDS12 protein that is formed by
specific binding of the binding agent to the respective
SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or WDS12
protein. Specific binding of the binding agent means
that the binding agent has a specific binding site that
recognizes a site on the SOCS14, SOCS15 (WDS11), SOCS17,
SOCS18, SOCS19, or WDS12 protein. For example,
antibodies raised to a SOCS14, SOCS15 (WDS11), SOCS17,
SOCS18, SOCS19, or WDS12 protein and recognizing an
epitope on the SOCS or WDS protein are capable of forming
a binding agent:SOCS or :WDS protein complex by specific
binding. Typically, the formation of a binding
agent:SOCS or :WDS protein complex allows the measurement
of SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or
WDS12 protein in a mixture of other proteins and
biologics. The term "antibody:SOCS or :WDS protein
complex" refers to an embodiment in which the binding
agent, e.g., is an antibody. The antibody may be
monoclonal, polyclonal, or a binding fragment of an
antibody, e.g., an Fv, Fab, or F(ab)2 fragment. The
antibody will preferably be a polyclonal antibody for
cross-reactivity purposes.



WO 99/03993    PCT/US98/14544

"Homologous" nucleic acid sequences, when compared, 
exhibit significant similarity, or identity. The
standards for homology in nucleic acids are either
measures for homology generally used in the art by
sequence comparison and/or phylogenetic relationship, or
based upon hybridization conditions. Hybridization
conditions are described in greater detail below.

An "isolated" nucleic acid is a nucleic acid, e.g.,
an RNA, DNA, cDNA, genomic DNA, or a mixed polymer, which
is substantially separated from other biologic components
which naturally accompany a native sequence, e.g.,
proteins and flanking genomic sequences from the
originating species. The term embraces a nucleic acid
sequence which has been removed from its naturally
occurring environment, and includes recombinant or cloned
DNA isolates and chemically synthesized analogs, or
analogs biologically synthesized by heterologous systems.
Further, the term includes double-stranded or single-
stranded embodiments. Where single-stranded, the nucleic
acid may be either the "sense" or the "antisense" strand.
A substantially pure molecule includes isolated forms of
the molecule. An isolated nucleic acid will usually
contain homogeneous nucleic acid molecules, but will, in
some embodiments, contain nucleic acids with minor
sequence heterogeneity. This heterogeneity is typically
found at the polymer ends or portions not critical to a
desired biological function or activity.

As used herein, the terms "SOCS" or "WDS" protein
shall encompass, when used in a protein context, a
protein having amino acid sequences shown in SEQ ID NO:
2, 4, 6, 8, 10, 12, 14, or 16 or a significant fragment
of such a protein, preferably a natural embodiment. By
the term "protein" or "polypeptide" is meant any chain of
amino acids, regardless of length or post-translational
modification (e.g., glycosylation or phosphorylation).
Further, the term encompasses polypeptides which are pre-
or pro-proteins. The invention also embraces a
polypeptide which exhibits similar structure to SOCS14,
SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or WDS12 protein,
e.g., which interacts with SOCS or WDS protein specific
binding components. These binding components, e.g.,
antibodies, typically bind to a SOCS or WDS protein,
respectively, with high affinity, e.g., at least about
100 nM, usually better than about 30 nM, preferably
better than about 10 nM, and more preferably better
than about 3 nM.

The term "polypeptide" or "protein" as used herein
includes a significant fragment or segment of a SOCS or
WDS protein, and encompasses a stretch of amino acid
residues of at least about 8 amino acids, generally at
least 10 amino acids, more generally at least 12 amino
acids, often at least 14 amino acids, more often at least
16 amino acids, typically at least 18 amino acids, more
typically at least 20 amino acids, usually at least 22
amino acids, more usually at least 24 amino acids,
preferably at least 26 amino acids, more preferably at
least 28 amino acids, and, in particularly preferred
embodiments, at least about 30 or more amino acids, e.g.,
35, 40, 45, 50, 60, 70, 80, etc. The invention
encompasses proteins comprising a plurality of distinct,
e.g., nonoverlapping, segments of the specified length.
Typically, the plurality will be at least two, more
usually at least three, and preferably 5, 7, or even
more. While the length minima are provided, longer
lengths, of various sizes, may be appropriate, e.g., one
of length 7, and two of length 12. Features of one of
the different genes should not be taken to limit those of
another of the genes.
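As an illustration of the "plurality of distinct, e.g., nonoverlapping, segments of the specified length" described above, the following minimal sketch scans a candidate polypeptide for nonoverlapping windows that are identical to some stretch of a reference sequence. The function name, the greedy left-to-right scan, and the toy sequences are assumptions made for illustration only; the specification does not prescribe any particular comparison procedure.

```python
def shared_segments(candidate, reference, min_len=12):
    """Return nonoverlapping (start, end) spans of `candidate`, each of
    length `min_len`, that are identical to some stretch of `reference`.

    Spans use 0-based, end-exclusive indices into `candidate`. The scan is
    greedy from left to right, so it reports one valid set of nonoverlapping
    segments rather than every possible set.
    """
    # Index every window of the reference once for constant-time lookup.
    windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    spans = []
    i = 0
    while i + min_len <= len(candidate):
        if candidate[i:i + min_len] in windows:
            spans.append((i, i + min_len))
            i += min_len  # skip past the match so that segments do not overlap
        else:
            i += 1
    return spans


# Hypothetical toy data: the 20 standard residues, each used once.
reference = "ACDEFGHIKLMNPQRSTVWY"
candidate = "YY" + reference[0:8] + "YYY" + reference[10:18]
spans = shared_segments(candidate, reference, min_len=8)
```

With these toy inputs the candidate carries two separate 8-residue stretches of the reference, so `len(spans) >= 2` would indicate a plurality of segments at that length minimum.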

A "recombinant" nucleic acid is defined either by
its method of production or its structure. In reference
to its method of production, e.g., a product made by a
process, the process is use of recombinant nucleic acid
techniques, e.g., involving human intervention in the
nucleotide sequence, typically selection or production.
Alternatively, it can be a nucleic acid made by
generating a sequence comprising fusion of two fragments
which are not naturally contiguous to each other, but is
meant to exclude products of nature, e.g., naturally
occurring mutants. Thus, for example, products made by
transforming cells with any non-naturally occurring
vector are encompassed, as are nucleic acids comprising
sequence derived using any synthetic oligonucleotide
process. Such is often done to replace a codon with a
redundant codon encoding the same or a conservative amino
acid, while typically introducing or removing a sequence
recognition site. Alternatively, it is performed to join
together nucleic acid segments of desired functions to
generate a single genetic entity comprising a desired
combination of functions not found in the commonly
available natural forms. Restriction enzyme recognition
sites are often the target of such artificial
manipulations, but other site specific targets, e.g.,
promoters, DNA replication sites, regulation sequences,
control sequences, or other useful features may be
incorporated by design. A similar concept is intended
for a recombinant, e.g., fusion, polypeptide.
Specifically included are synthetic nucleic acids which,
by genetic code redundancy, encode polypeptides similar
to fragments of these antigens, and fusions of sequences
from various different species variants.

"Solubility" is reflected by sedimentation measured 
in Svedberg units, which are a measure of the
sedimentation velocity of a molecule under particular
conditions. The determination of the sedimentation
velocity was classically performed in an analytical
ultracentrifuge, but is typically now performed in a
standard ultracentrifuge. See, Freifelder (1982)
Physical Biochemistry (2d ed.) W.H. Freeman & Co., San
Francisco, CA; and Cantor and Schimmel (1980) Biophysical
Chemistry parts 1-3, W.H. Freeman & Co., San Francisco,
CA. As a crude determination, a sample containing a
putatively soluble polypeptide is spun in a standard full
sized ultracentrifuge at about 50K rpm for about 10
minutes, and soluble molecules will remain in the
supernatant. A soluble particle or polypeptide will
typically be less than about 30S, more typically less
than about 15S, usually less than about 10S, more usually
less than about 6S, and, in particular embodiments,
preferably less than about 4S, and more preferably less
than about 3S. Solubility of a polypeptide or fragment
depends upon the environment and the polypeptide. Many
parameters affect polypeptide solubility, including
temperature, electrolyte environment, size and molecular
characteristics of the polypeptide, and nature of the
solvent. Typically, the temperature at which the
polypeptide is used ranges from about 4° C to about 65°
C. Usually the temperature at use is greater than about
18° C and more usually greater than about 22° C. For
diagnostic purposes, the temperature will usually be
about room temperature or warmer, but less than the
denaturation temperature of components in the assay. For
therapeutic purposes, the temperature will usually be
body temperature, typically about 37° C for humans,
though under certain situations the temperature may be
raised or lowered in situ or in vitro.

The size and structure of the polypeptide should
generally be in a substantially stable state, and usually
not in a denatured state. The polypeptide may be
associated with other polypeptides in a quaternary
structure, e.g., to confer solubility, or associated with
lipids or detergents in a manner which approximates
natural lipid bilayer interactions.

The solvent will usually be a biologically
compatible buffer, of a type used for preservation of
biological activities, and will usually approximate a
physiological solvent. Usually the solvent will have a
neutral pH, typically between about 5 and 10, and
preferably about 7.5. On some occasions, a detergent
will be added, typically a mild non-denaturing one, e.g.,
CHS (cholesteryl hemisuccinate) or CHAPS
(3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate),
or a low enough concentration as to avoid significant
disruption of structural or physiological properties of
the protein.

"Substantially pure" in a protein context typically
means that the protein is isolated from other
contaminating proteins, nucleic acids, and other
biologicals derived from the original source organism.
Purity, or "isolation", may be assayed by standard
methods, and will ordinarily be at least about 50% pure,
more ordinarily at least about 60% pure, generally at
least about 70% pure, more generally at least about 80%
pure, often at least about 85% pure, more often at least
about 90% pure, preferably at least about 95% pure, more
preferably at least about 98% pure, and in most preferred
embodiments, at least 99% pure. Similar concepts apply,
e.g., to antibodies or nucleic acids.

"Substantial similarity" in the nucleic acid
sequence comparison context means either that the
segments, or their complementary strands, when compared,
are identical when optimally aligned, with appropriate
nucleotide insertions or deletions, in at least about 50%
of the nucleotides, generally at least 56%, more
generally at least 59%, ordinarily at least 62%, more
ordinarily at least 65%, often at least 68%, more often
at least 71%, typically at least 74%, more typically at
least 77%, usually at least 80%, more usually at least
about 85%, preferably at least about 90%, more preferably
at least about 95 to 98% or more, and in particular
embodiments, as high as about 99% or more of the
nucleotides. Alternatively, substantial similarity
exists when the segments will hybridize under selective
hybridization conditions, to a strand, or its complement,
typically using a sequence derived from SEQ ID NO: 1, 3,
5, 7, 9, 11, 13, or 15. Typically, selective
hybridization will occur when there is at least about 55%
similarity over a stretch of at least about 30
nucleotides, preferably at least about 65% over a stretch
of at least about 25 nucleotides, more preferably at
least about 75%, and most preferably at least about 90%
over about 20 nucleotides. See Kanehisa (1984) Nuc.
Acids Res. 12:203-213. The length of similarity
comparison, as described, may be over longer stretches,
and in certain embodiments will be over a stretch of at
least about 17 nucleotides, usually at least about 20
nucleotides, more usually at least about 24 nucleotides,
typically at least about 28 nucleotides, more typically
at least about 40 nucleotides, preferably at least about
50 nucleotides, and more preferably at least about 75 to
100 or more nucleotides, e.g., 150, 200, etc.
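
As an illustration only, the windowed thresholds above (e.g., at
least about 55% similarity over a stretch of at least about 30
nucleotides) can be checked on two segments that are already
aligned. The sketch below assumes equal-length, ungapped input and
treats "similarity" as simple identity; the function name
max_window_identity is hypothetical and is not part of the
disclosure.

```python
def max_window_identity(seg_a: str, seg_b: str, window: int) -> float:
    """Highest percent identity found in any ungapped window of the given size.

    Assumes seg_a and seg_b are pre-aligned and of equal length
    (an assumption of this sketch).
    """
    if len(seg_a) != len(seg_b):
        raise ValueError("segments must be pre-aligned to equal length")
    if window <= 0 or window > len(seg_a):
        raise ValueError("window must be between 1 and the segment length")

    # 1 where the aligned positions are identical, 0 otherwise.
    matches = [1 if x == y else 0 for x, y in zip(seg_a.upper(), seg_b.upper())]

    # Sliding-window sum of matches.
    current = sum(matches[:window])
    best = current
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


# Example: 20 identical positions followed by 10 mismatches.
a = "A" * 30
b = "A" * 20 + "C" * 10
meets_55_over_30 = max_window_identity(a, b, 30) >= 55.0
meets_90_over_20 = max_window_identity(a, b, 20) >= 90.0
```

Both threshold checks pass for this toy pair, since the 30-nucleotide
window is about 67% identical and the best 20-nucleotide window is
fully identical.
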
For sequence comparison, typically one sequence acts
as a reference sequence, to which test sequences are
compared. When using a sequence comparison algorithm,
test and reference sequences are input into a computer,
subsequence coordinates are designated, if necessary, and
sequence algorithm program parameters are designated.
The sequence comparison algorithm then calculates the
percent sequence identity for the test sequence(s)
relative to the reference sequence, based on the
designated program parameters.
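
A minimal sketch of the final step, computing percent sequence
identity for a test sequence relative to a reference once the two
have been aligned, is given below. The choice of denominator
(all aligned columns, with gaps counted as non-identical) is an
assumption made for the example; the programs cited herein may
count differently.

```python
def percent_identity(aligned_ref: str, aligned_test: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap.

    Assumption of this sketch: every column counts toward the
    denominator except columns that are gaps in both sequences.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")

    columns = [
        (r, t)
        for r, t in zip(aligned_ref.upper(), aligned_test.upper())
        if not (r == "-" and t == "-")
    ]
    if not columns:
        return 0.0

    identical = sum(1 for r, t in columns if r == t and r != "-")
    return 100.0 * identical / len(columns)


# Four of six aligned columns are identical.
example = percent_identity("ACGT-A", "ACCTGA")
```
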

Optimal alignment of sequences for comparison can be
conducted, e.g., by the local homology algorithm of Smith
and Waterman (1981) Adv. Appl. Math. 2:482, by the
homology alignment algorithm of Needleman and Wunsch
(1970) J. Mol. Biol. 48:443, by the search for similarity
method of Pearson and Lipman (1988) Proc. Nat'l Acad.
Sci. USA 85:2444, by computerized implementations of
these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the
Wisconsin Genetics Software Package, Genetics Computer
Group, 575 Science Dr., Madison, WI), or by visual
inspection (see generally Ausubel et al., supra).
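
For orientation, a compact global alignment in the style of
Needleman and Wunsch can be sketched as follows. This is not the
GAP implementation; the match, mismatch, and linear gap scores are
arbitrary values assumed for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment by dynamic programming with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are
    illustrative assumptions.
    """
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] is the best score aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


result = needleman_wunsch("ACGT", "AGT")
```

The aligned strings returned here can be passed to a percent
identity calculation of the kind described above.
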

One example of a useful algorithm is PILEUP. PILEUP
creates a multiple sequence alignment from a group of
related sequences using progressive, pairwise alignments
to show relationship and percent sequence identity. It
also plots a tree or dendrogram showing the clustering
relationships used to create the alignment. PILEUP uses
a simplification of the progressive alignment method of
Feng and Doolittle (1987) J. Mol. Evol. 35:351-360. The
method used is similar to the method described by Higgins
and Sharp (1989) CABIOS 5:151-153. The program can align
up to 300 sequences, each of a maximum length of 5,000
nucleotides or amino acids. The multiple alignment
procedure begins with the pairwise alignment of the two
most similar sequences, producing a cluster of two
aligned sequences. This cluster is then aligned to the
next most related sequence or cluster of aligned
sequences. Two clusters of sequences are aligned by a
simple extension of the pairwise alignment of two
individual sequences. The final alignment is achieved by
a series of progressive, pairwise alignments. The
program is run by designating specific sequences and
their amino acid or nucleotide coordinates for regions of
sequence comparison and by designating the program
parameters. For example, a reference sequence can be
compared to other test sequences to determine the percent
sequence identity relationship using the following
parameters: default gap weight (3.00), default gap length
weight (0.10), and weighted end gaps.
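
The order in which a progressive procedure of this kind joins
sequences can be sketched as below. This is not PILEUP itself:
the average-linkage rule for comparing clusters, the function
name, and the precomputed similarity table are assumptions made
purely for illustration.

```python
from itertools import combinations


def progressive_merge_order(names, similarity):
    """Return the sequence of cluster joins, most similar pair first.

    similarity maps frozenset({name_a, name_b}) to a pairwise score.
    Clusters are compared by average pairwise similarity (an
    assumption of this sketch).
    """
    clusters = [(name,) for name in names]
    order = []

    def average_similarity(c1, c2):
        total = sum(similarity[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Pick the two clusters that are most similar to one another.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: average_similarity(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        order.append((c1, c2))
    return order


scores = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 0.4,
}
joins = progressive_merge_order(["A", "B", "C"], scores)
```

In this toy case A and B are joined first, and the resulting
cluster is then joined to C, mirroring the "two most similar
sequences, then the next most related sequence or cluster" order
described above.
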
Another example of an algorithm that is suitable for
determining percent sequence identity and sequence
similarity is the BLAST algorithm, which is described in
Altschul, et al. (1990) J. Mol. Biol. 215:403-410.
Software for performing BLAST analyses is publicly
available through the National Center for Biotechnology
Information (http://www.ncbi.nlm.nih.gov/). This algorithm
involves first identifying high scoring sequence pairs
(HSPs) by identifying short words of length W in the
query sequence, which either match or satisfy some
positive-valued threshold score T when aligned with a
word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold
(Altschul, et al., supra). These initial neighborhood
word hits act as seeds for initiating searches to find
longer HSPs containing them. The word hits are then
extended in both directions along each sequence for as
far as the cumulative alignment score can be increased.
Extension of the word hits in each direction is halted
when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the
cumulative score goes to zero or below, due to the
accumulation of one or more negative-scoring residue
alignments; or the end of either sequence is reached.
The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLAST
program uses as defaults a word length (W) of 11, the
BLOSUM62 scoring matrix (see Henikoff and Henikoff (1989)
Proc. Nat'l Acad. Sci. USA 89:10915), alignments (B) of
50, expectation (E) of 10, M=5, N=4, and a comparison of
both strands.
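
The seed-and-extend idea described above can be sketched in a
much simplified form. The code below is not the BLAST program: it
uses exact word matches only (no neighborhood threshold T),
ungapped extension, a reward of 5 and a penalty of -4 for
nucleotides (an interpretation of M=5, N=4), and an assumed
drop-off X of 20. All function names are hypothetical.

```python
def find_word_hits(query: str, subject: str, w: int):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def _extend(pairs, start_score, reward, penalty, x_drop):
    """Extend over residue pairs in one direction; return (best_gain, residues_kept)."""
    running = best = 0
    kept = 0
    for length, (q, s) in enumerate(pairs, start=1):
        running += reward if q == s else penalty
        if running > best:
            best, kept = running, length
        # Halt on a drop of X from the maximum, or when the total reaches zero or below.
        if best - running >= x_drop or start_score + running <= 0:
            break
    return best, kept


def extend_hit(query, subject, qi, sj, w, reward=5, penalty=-4, x_drop=20):
    """Extend a word hit in both directions; return (score, query_start, query_end)."""
    seed = w * reward
    gain_r, len_r = _extend(zip(query[qi + w:], subject[sj + w:]),
                            seed, reward, penalty, x_drop)
    gain_l, len_l = _extend(zip(reversed(query[:qi]), reversed(subject[:sj])),
                            seed, reward, penalty, x_drop)
    return seed + gain_l + gain_r, qi - len_l, qi + w + len_r


query = "TTTACGTACGTAAA"
subject = "GGGACGTACGTCCC"
hits = find_word_hits(query, subject, 4)
hsp = extend_hit(query, subject, 3, 3, 4)
```

Here the word hit at position 3 of both sequences extends to the
right over four further matching residues and stops at the
mismatching flanks, so the segment query[3:11] is reported.
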
In addition to calculating percent sequence
identity, the BLAST algorithm also performs a statistical
analysis of the similarity between two sequences (see,
e.g., Karlin and Altschul (1993) Proc. Nat'l Acad. Sci.
USA 90:5873-5787). One measure of similarity provided by
the BLAST algorithm is the smallest sum probability
(P(N)), which provides an indication of the probability
by which a match between two nucleotide or amino acid
sequences would occur by chance. For example, a nucleic
acid is considered similar to a reference sequence if the
smallest sum probability in a comparison of the test
nucleic acid to the reference nucleic acid is less than
about 0.1, more preferably less than about 0.01, and most
preferably less than about 0.001.

A further indication that two nucleic acid sequences
or polypeptides are substantially identical is that the
polypeptide encoded by the first nucleic acid is
immunologically cross reactive with the polypeptide
encoded by the second nucleic acid, as described below.
Thus, a polypeptide is typically substantially identical
to a second polypeptide, for example, where the two
peptides differ only by conservative substitutions.
Another indication that two nucleic acid sequences are
substantially identical is that the two molecules
hybridize to each other under stringent conditions, as
described below.

"Stringent conditions", in referring to homology or
substantial similarity in the hybridization context, will
be stringent combined conditions of salt, temperature,
organic solvents, and other parameters, typically those
controlled in hybridization reactions. The combination
of parameters is more important than the measure of any
single parameter. See, e.g., Wetmur and Davidson (1968)
J. Mol. Biol. 31:349-370.

A nucleic acid probe which binds to a target nucleic
acid under stringent conditions is specific for said
target nucleic acid. Hybridization under stringent
conditions should give a signal of at least 2-fold
over background, preferably at least 3-5 fold or more.
Such a probe is typically more than 11 nucleotides in length,



WO 99/03993 



PCT/US98/14544

20 



and is sufficiently identical or complementary to a 
target nucleic acid over the region specified by the 
sequence of the probe to bind the target under stringent 
hybridization conditions. 
SOCS14, SOCS15 (WDS11), SOCS17, SOCS18, SOCS19, or
WDS12 protein from other mammalian species can be cloned
and isolated by cross-species hybridization of closely
related species. See, e.g., below. Similarity may be
relatively low between distantly related species, and
thus hybridization of relatively closely related species
is advisable. Alternatively, preparation of an antibody
preparation which exhibits less species specificity may
be useful in expression cloning approaches.

The phrase "specifically binds to an antibody" or
"specifically immunoreactive with", when referring to a
protein or peptide, refers to a binding reaction which is
determinative of the presence of the protein in the
presence of a heterogeneous population of proteins and
other biological components. Thus, under designated
immunoassay conditions, the specified antibodies bind to
a particular protein and do not significantly bind other
proteins present in the sample. Specific binding to an
antibody under such conditions may require an antibody
that is selected for its specificity for a particular
protein. For example, antibodies raised to the protein
immunogen with the amino acid sequence depicted in SEQ ID
NO: 2, 4, 6, 8, 10, 12, 14, or 16 can be selected to
obtain antibodies specifically immunoreactive with SOCS
or WDS proteins and not with other proteins. These
antibodies recognize proteins highly similar to the
homologous SOCS or WDS protein.

III. Nucleic Acids 

Primate or rodent SOCS or WDS protein is each
exemplary of a larger class of structurally and
functionally related proteins. These soluble proteins
will serve to transmit signals between different cell
types. The preferred embodiments, as disclosed, will be
useful in standard procedures to isolate genes from
different individuals or other species, e.g., warm
blooded animals, such as birds and mammals. Cross
hybridization will allow isolation of related genes
encoding proteins from individuals, strains, or species.
A number of different approaches are available to
successfully isolate a suitable nucleic acid clone based
upon the information provided herein. Southern blot
hybridization studies can qualitatively determine the
presence of homologous genes in human, monkey, rat,
mouse, dog, cow, and rabbit genomes under specific
hybridization conditions.

Complementary sequences will also be used as probes
or primers. Based upon identification of the likely
amino terminus, other peptides should be particularly
useful, e.g., coupled with anchored vector or poly-A
complementary PCR techniques or with complementary DNA of
other peptides.

Techniques for nucleic acid manipulation of genes
encoding SOCS or WDS proteins, such as subcloning nucleic
acid sequences encoding polypeptides into expression
vectors, labeling probes, DNA hybridization, and the like
are described generally in Sambrook, et al. (1989)
Molecular Cloning: A Laboratory Manual (2nd ed.) Vol. 1-3,
Cold Spring Harbor Laboratory, Cold Spring Harbor
Press, NY, which is incorporated herein by reference.
This manual is hereinafter referred to as "Sambrook, et
al."

There are various methods of isolating DNA sequences
encoding SOCS or WDS proteins. For example, DNA is
isolated from a genomic or cDNA library using labeled
oligonucleotide probes having sequences identical or
complementary to the sequences disclosed herein.
Full-length probes may be used, or oligonucleotide probes
may be generated by comparison of the sequences disclosed.
Such probes can be used directly in hybridization assays
to isolate DNA encoding SOCS or WDS proteins, or probes
can be designed for use in amplification techniques such
as PCR, for the isolation of DNA encoding SOCS or WDS
proteins.

To prepare a cDNA library, mRNA is isolated from
cells which express a SOCS or WDS protein. cDNA is
prepared from the mRNA and ligated into a recombinant
vector. The vector is transfected into a recombinant
host for propagation, screening, and cloning. Methods
for making and screening cDNA libraries are well known.
See Gubler and Hoffman (1983) Gene 25:263-269 and
Sambrook, et al.

For a genomic library, the DNA can be extracted from
tissue and either mechanically sheared or enzymatically
digested to yield fragments of about 12-20 kb. The
fragments are then separated by gradient centrifugation
and cloned in bacteriophage lambda vectors. These
vectors and phage are packaged in vitro, as described in
Sambrook, et al. Recombinant phage are analyzed by
plaque hybridization as described in Benton and Davis
(1977) Science 196:180-182. Colony hybridization is
carried out as generally described in, e.g., Grunstein, et
al. (1975) Proc. Natl. Acad. Sci. USA 72:3961-3965.

DNA encoding a SOCS14 or SOCS15 protein can be
identified in either cDNA or genomic libraries by its
ability to hybridize with the nucleic acid probes
described herein, e.g., in colony or plaque hybridization
assays. The corresponding DNA regions are isolated by
standard methods familiar to those of skill in the art.
See, e.g., Sambrook, et al.

Various methods of amplifying target sequences, such
as the polymerase chain reaction, can also be used to
prepare DNA encoding SOCS or WDS proteins. Polymerase
chain reaction (PCR) technology is used to amplify such
nucleic acid sequences directly from mRNA, from cDNA, and
from genomic libraries or cDNA libraries. The isolated
sequences encoding SOCS or WDS proteins may also be used
as templates for PCR amplification.

Typically, in PCR techniques, oligonucleotide
primers complementary to two 5' regions in the DNA region
to be amplified are synthesized. The polymerase chain
reaction is then carried out using the two primers. See
Innis, et al. (eds.) (1990) PCR Protocols: A Guide to
Methods and Applications Academic Press, San Diego, CA.
Primers can be selected to amplify the entire regions
encoding a full-length SOCS or WDS protein or to amplify
smaller DNA segments as desired. Once such regions are
PCR-amplified, they can be sequenced and oligonucleotide
probes can be prepared from sequence obtained using
standard techniques. These probes can then be used to
isolate DNAs encoding SOCS or WDS proteins.
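
As a hedged illustration of primer selection for a region to be
amplified, the sketch below takes the two ends of a template
strand and returns a forward primer (identical to the start of
the given strand) and a reverse primer (the reverse complement
of its end), which is the conventional arrangement. The default
primer length of 20 and the function names are assumptions; no
melting temperature or specificity checks are attempted.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_primers(template: str, length: int = 20):
    """Return (forward, reverse) primers, both written 5' to 3'.

    The forward primer copies the first `length` bases of the
    template strand; the reverse primer is the reverse complement
    of the last `length` bases, so it anneals to the given strand.
    """
    if length <= 0 or len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Toy template; real primers would be considerably longer.
fwd, rev = flanking_primers("ATGCCCGGGTTTAAA", length=4)
```
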

Oligonucleotides for use as probes are usually
chemically synthesized according to the solid phase
phosphoramidite triester method first described by
Beaucage and Carruthers (1983) Tetrahedron Lett.
22(20):1859-1862, or using an automated synthesizer, as
described in Needham-VanDevanter, et al. (1984) Nucleic
Acids Res. 12:6159-6168. Purification of
oligonucleotides is performed, e.g., by native acrylamide
gel electrophoresis or by anion-exchange HPLC as
described in Pearson and Regnier (1983) J. Chrom.
255:137-149. The sequence of the synthetic
oligonucleotide can be verified using, e.g., the chemical
degradation method of Maxam, A.M. and Gilbert, W. in
Grossman, L. and Moldave (eds.) (1980) Methods in
Enzymology 65:499-560 Academic Press, New York.

Isolated nucleic acids encoding SOCS or WDS proteins 
were identified. The nucleotide sequences and 
corresponding open reading frames are provided in SEQ ID 
NO: 1 through 16. 

These SOCS or WDS proteins exhibit limited
similarity to portions of other intracellular proteins. In
particular, β-sheet and α-helix residues can be
determined using, e.g., RASMOL program, see Sayle and
Milner-White (1995) TIBS 20:374-376; or Gronenberg, et
al. (1991) Protein Engineering 4:263-269; and other
structural features are defined in Lodi, et al. (1994)
Science 263:1762-1767.

This invention provides isolated DNA or fragments to
encode a SOCS or WDS protein. In addition, this
invention provides isolated or recombinant DNA which
encodes a protein or polypeptide which is capable of
hybridizing under appropriate conditions, e.g., high
stringency, with the DNA sequences described herein.
Said biologically active protein or polypeptide can be an
intact protein, or fragment, and have an amino acid
sequence as disclosed in SEQ ID NO: 2, 4, 6, 8, 10, 12,
14, or 16, particularly natural embodiments. Preferred
embodiments will be full length natural sequences.
Further, this invention contemplates the use of isolated
or recombinant DNA, or fragments thereof, which encode
proteins which are homologous to a SOCS or WDS protein or
which were isolated using cDNA encoding a SOCS or WDS
protein as a probe. The isolated DNA can have the
respective regulatory sequences in the 5' and 3' flanks,
e.g., promoters, enhancers, poly-A addition signals, and
others. Also embraced are methods for making expression
vectors with these sequences, or for making, e.g.,
expressing and purifying, protein products.

A DNA which codes for a SOCS or WDS protein will be
particularly useful to identify genes, mRNA, and cDNA
species which code for related or similar proteins, as
well as DNAs which code for homologous proteins from
different species. There are likely homologs in other
species, including primates, rodents, canines, felines,
and birds. Various SOCS or WDS proteins should be
homologous and are encompassed herein. However, even
proteins that have a more distant evolutionary
relationship to the antigen can readily be isolated under
appropriate conditions using these sequences if they are
sufficiently homologous. Primate SOCS or WDS proteins
are of particular interest.

Recombinant clones derived from the genomic
sequences, e.g., containing introns, will be useful for
transgenic studies, including, e.g., transgenic cells and
organisms, and for gene therapy. See, e.g., Goodnow
(1992) "Transgenic Animals" in Roitt (ed.) Encyclopedia
of Immunology, Academic Press, San Diego, pp. 1502-1504;
Travis (1992) Science 256:1392-1394; Kuhn, et al. (1991)
Science 254:707-710; Capecchi (1989) Science 244:1288;
Robertson (1987) (ed.) Teratocarcinomas and Embryonic Stem
Cells: A Practical Approach, IRL Press, Oxford; and
Rosenberg (1992) J. Clinical Oncology 10:180-199.



IV. Antibodies

Antibodies can be raised to various SOCS14 or SOCS15
proteins, including individual, polymorphic, allelic,
strain, or species variants, and fragments thereof, both
in their naturally occurring (full-length) forms and in
their recombinant forms. Additionally, antibodies can be
raised to SOCS or WDS proteins in either their active
forms or in their inactive forms. Anti-idiotypic
antibodies may also be used.

A. Antibody Production

A number of immunogens may be used to produce
antibodies specifically reactive with SOCS or WDS
proteins. Recombinant protein is the preferred immunogen
for the production of monoclonal or polyclonal
antibodies. Naturally occurring protein may also be used
either in pure or impure form. Synthetic peptides, made
using the human SOCS14 or SOCS15 protein sequences
described herein, may also be used as an immunogen for the
production of antibodies to SOCS14 or SOCS15 proteins.
Recombinant protein can be expressed in eukaryotic or
prokaryotic cells as described herein, and purified as
described. Naturally folded or denatured material can be
used, as appropriate, for producing antibodies. Either
monoclonal or polyclonal antibodies may be generated for
subsequent use in immunoassays to measure the protein.

Methods of producing polyclonal antibodies are known
to those of skill in the art. Typically, an immunogen,
preferably a purified protein, is mixed with an adjuvant
and animals are immunized with the mixture. The animal's
immune response to the immunogen preparation is monitored
by taking test bleeds and determining the titer of
reactivity to the SOCS or WDS protein of interest. When
appropriately high titers of antibody to the immunogen
are obtained, usually after repeated immunizations, blood
is collected from the animal and antisera are prepared.
Further fractionation of the antisera to enrich for
antibodies reactive to the protein can be done if
desired. See, e.g., Harlow and Lane; or Coligan.

Monoclonal antibodies may be obtained by various
techniques familiar to those skilled in the art.
Typically, spleen cells from an animal immunized with a
desired antigen are immortalized, commonly by fusion with
a myeloma cell (see, Kohler and Milstein (1976) Eur. J.
Immunol. 6:511-519, incorporated herein by reference).
Alternative methods of immortalization include
transformation with Epstein Barr Virus, oncogenes, or
retroviruses, or other methods known in the art.
Colonies arising from single immortalized cells are
screened for production of antibodies of the desired
specificity and affinity for the antigen, and yield of
the monoclonal antibodies produced by such cells may be
enhanced by various techniques, including injection into
the peritoneal cavity of a vertebrate host.
Alternatively, one may isolate DNA sequences which encode
a monoclonal antibody or a binding fragment thereof by
screening a DNA library from human B cells according,
e.g., to the general protocol outlined by Huse, et al.
(1989) Science 246:1275-1281.

Antibodies, including binding fragments and single
chain versions, against predetermined fragments of SOCS
or WDS protein can be raised by immunization of animals
with conjugates of the fragments with carrier proteins as
described above. Monoclonal antibodies are prepared from
cells secreting the desired antibody. These antibodies
can be screened for binding to normal or defective SOCS
or WDS proteins, or screened for agonistic or
antagonistic activity, e.g., effect on cell cycle
progression or transcription of specific genes. These
monoclonal antibodies will usually bind with at least a
KD of about 1 mM, more usually at least about 300 µM,
typically at least about 10 µM, more typically at least
about 30 µM, preferably at least about 10 µM, and more
preferably at least about 3 µM or better.

In some instances, it is desirable to prepare
monoclonal antibodies from various mammalian hosts, such
as mice, rodents, primates, humans, etc. Description of
techniques for preparing such monoclonal antibodies may
be found in, e.g., Stites, et al. (eds.) Basic and
Clinical Immunology (4th ed.) Lange Medical Publications,
Los Altos, CA, and references cited therein; Harlow and
Lane (1988) Antibodies: A Laboratory Manual CSH Press;
Goding (1986) Monoclonal Antibodies: Principles and
Practice (2d ed.) Academic Press, New York, NY; and
particularly in Kohler and Milstein (1975) Nature
256:495-497, which discusses one method of generating
monoclonal antibodies. Summarized briefly, this method
involves injecting an animal with an immunogen. The
animal is then sacrificed and cells taken from its
spleen, which are then fused with myeloma cells. The
result is a hybrid cell or "hybridoma" that is capable of
reproducing in vitro. The population of hybridomas is
then screened to isolate individual clones, each of which
secretes a single antibody species to the immunogen. In
this manner, the individual antibody species obtained are
the products of immortalized and cloned single B cells
from the immune animal generated in response to a
specific site recognized on the immunogenic substance.

Other suitable techniques involve selection of
libraries of antibodies in phage or similar vectors.
See, e.g., Huse, et al. (1989) "Generation of a Large
Combinatorial Library of the Immunoglobulin Repertoire in
Phage Lambda," Science 246:1275-1281; and Ward, et al.
(1989) Nature 341:544-546. The polypeptides and
antibodies of the present invention may be used with or
without modification, including chimeric or humanized
antibodies. Frequently, the polypeptides and antibodies
will be labeled by joining, either covalently or
non-covalently, a substance which provides for a detectable
signal. A wide variety of labels and conjugation
techniques are known and are reported extensively in both
the scientific and patent literature. Suitable labels
include radionuclides, enzymes, substrates, cofactors,
inhibitors, fluorescent moieties, chemiluminescent
moieties, magnetic particles, and the like. Patents
teaching the use of such labels include U.S. Patent Nos.
3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437;
4,275,149; and 4,366,241. Also, recombinant
immunoglobulins may be produced. See, Cabilly, U.S.
Patent No. 4,816,567; and Queen, et al. (1989) Proc.
Nat'l Acad. Sci. USA 86:10029-10033.


The antibodies of this invention are useful for
affinity chromatography in isolating SOCS or WDS protein.
Columns can be prepared where the antibodies are linked
to a solid support, e.g., particles, such as agarose,
SEPHADEX, or the like, where a cell lysate or supernatant
may be passed through the column, the column washed,
followed by increasing concentrations of a mild
denaturant, whereby purified SOCS or WDS protein will be
released.

The antibodies may also be used to screen expression
libraries for particular expression products. Usually
the antibodies used in such a procedure will be labeled
with a moiety allowing easy detection of presence of
antigen by antibody binding.

Antibodies to SOCS or WDS proteins may be used for
the identification of cell populations expressing the
proteins. By assaying, e.g., by histology or otherwise,
probably a disruptive assay which kills that sample of
cells, the expression products of cells expressing SOCS
or WDS proteins, it is possible to diagnose disease, e.g.,
cancerous conditions.

Antibodies raised against each SOCS or WDS protein
will also be useful to raise anti-idiotypic antibodies.
These will be useful in detecting or diagnosing various
immunological conditions related to expression of the
respective antigens.

B. Immunoassays

A particular protein can be measured by a variety of
immunoassay methods. For a review of immunological and
immunoassay procedures in general, see Stites and Terr
(eds.) (1991) Basic and Clinical Immunology (7th ed.).
Moreover, the immunoassays of the present invention can
be performed in many configurations, which are reviewed
extensively in Maggio (ed.) (1980) Enzyme Immunoassay CRC
Press, Boca Raton, Florida; Tijan (1985) "Practice and
Theory of Enzyme Immunoassays," Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science
Publishers B.V., Amsterdam; and Harlow and Lane
Antibodies, A Laboratory Manual, supra, each of which is
incorporated herein by reference. See also Chan (ed.)
(1987) Immunoassay: A Practical Guide Academic Press,
Orlando, FL; Price and Newman (eds.) (1991) Principles
and Practice of Immunoassays Stockton Press, NY; and Ngo
(ed.) (1988) Non-isotopic Immunoassays Plenum Press, NY.

Immunoassays for measurement of SOCS or WDS proteins
can be performed by a variety of methods known to those
skilled in the art. In brief, immunoassays to measure
the protein can be either competitive or noncompetitive
binding assays. In competitive binding assays, the
sample to be analyzed competes with a labeled analyte for
specific binding sites on a capture agent bound to a
solid surface. Preferably the capture agent is an
antibody specifically reactive with SOCS or WDS proteins
produced as described above. The concentration of
labeled analyte bound to the capture agent is inversely
proportional to the amount of free analyte present in the
sample.

In a competitive binding immunoassay, the SOCS or
WDS protein present in the sample competes with labeled
protein for binding to a specific binding agent, for
example, an antibody specifically reactive with the SOCS
or WDS protein. The binding agent may be bound to a
solid surface to effect separation of bound labeled
protein from the unbound labeled protein. Alternately,
the competitive binding assay may be conducted in liquid
phase and a variety of techniques known in the art may be
used to separate the bound labeled protein from the
unbound labeled protein. Following separation, the
amount of bound labeled protein is determined. The
amount of protein present in the sample is inversely
proportional to the amount of labeled protein binding.
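
The inverse relationship in a competitive assay can be
illustrated with a deliberately simple model. The sketch assumes
that labeled and unlabeled protein bind the agent with equal
affinity and that binding sites are limiting, so the labeled
share of bound protein equals its share of total protein. These
assumptions, and the function names, are introduced for the
example only; a real assay would be read against a standard
curve.

```python
def labeled_bound_fraction(labeled: float, unlabeled: float) -> float:
    """Fraction of occupied sites carrying labeled protein (equal-affinity model)."""
    if labeled <= 0 or unlabeled < 0:
        raise ValueError("labeled must be > 0 and unlabeled must be >= 0")
    return labeled / (labeled + unlabeled)


def unlabeled_from_fraction(labeled: float, fraction: float) -> float:
    """Invert the model: estimate sample protein from the observed labeled fraction."""
    if labeled <= 0 or not 0 < fraction <= 1:
        raise ValueError("labeled must be > 0 and fraction must be in (0, 1]")
    return labeled * (1.0 / fraction - 1.0)


# More sample protein means less labeled protein bound.
fractions = [labeled_bound_fraction(10.0, sample) for sample in (0.0, 10.0, 30.0)]
estimate = unlabeled_from_fraction(10.0, 0.25)
```
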

Alternatively, a homogeneous immunoassay may be
performed in which a separation step is not needed. In
these immunoassays, the label on the protein is altered
by the binding of the protein to its specific binding
agent. This alteration in the labeled protein results in
a decrease or increase in the signal emitted by label, so
that measurement of the label at the end of the
immunoassay allows for detection or quantitation of the
protein.

Qualitative or quantitative analysis of SOCS or WDS
proteins may also be performed by a variety of
noncompetitive immunoassay methods. For example, a
two-site, solid phase sandwich immunoassay may be used. In
this type of assay, a binding agent for the protein, for
example an antibody, is attached to a solid support. A
second protein binding agent, which may also be an
antibody, and which binds the protein at a different
site, is labeled. After binding at both sites on the
protein has occurred, the unbound labeled binding agent
is removed and the amount of labeled binding agent bound
to the solid phase is measured. The amount of labeled
binding agent bound is directly proportional to the
amount of protein in the sample.
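Because the sandwich signal rises in direct proportion to the amount
of protein, an unknown can be read from a straight-line standard
curve. The sketch below is illustrative only: the least-squares fit,
the function names, and the standard values are assumptions and are
not taken from the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired standards")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standards must span more than one amount")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_signal(signal, slope, intercept):
    """Interpolate the protein amount that gives the measured signal."""
    return (signal - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standards: protein amount vs. bound-label signal.
    amounts = [0.0, 1.0, 2.0, 4.0]
    signals = [0.1, 0.6, 1.1, 2.1]
    slope, intercept = fit_line(amounts, signals)
    print(amount_from_signal(1.6, slope, intercept))
```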

Western blot analysis can be used to determine the
presence of SOCS or WDS proteins in a sample.
Electrophoresis is carried out, for example, on a tissue
sample suspected of containing the protein. Following
electrophoresis to separate the proteins, and transfer of
the proteins to a suitable solid support, e.g., a
nitrocellulose filter, the solid support is incubated
with an antibody reactive with the protein. This
antibody may be labeled, or alternatively may be detected
by subsequent incubation with a second labeled antibody
that binds the primary antibody.

The immunoassay formats described above employ
labeled assay components. The label may be coupled
directly or indirectly to the desired component of the
assay according to methods well known in the art. A wide
variety of labels and methods may be used.

Traditionally, a radioactive label incorporating ³H,
¹²⁵I, ³⁵S, ¹⁴C, or ³²P was used. Non-radioactive labels
include proteins which bind to labeled antibodies,
fluorophores, chemiluminescent agents, enzymes, and
antibodies which can serve as specific binding pair
members for a labeled protein. The choice of label
depends on sensitivity required, ease of conjugation with
the compound, stability requirements, and available
instrumentation. For a review of various labeling or
signal producing systems which may be used, see U.S.
Patent No. 4,391,904, which is incorporated herein by
reference.

Antibodies reactive with a particular protein can
also be measured by a variety of immunoassay methods.
For a review of immunological and immunoassay procedures
applicable to the measurement of antibodies by
immunoassay techniques, see Stites and Terr (eds.) Basic
and Clinical Immunology (7th ed.) supra; Maggio (ed.)
Enzyme Immunoassay, supra; and Harlow and Lane
Antibodies: A Laboratory Manual, supra.

In brief, immunoassays to measure antisera reactive
with SOCS or WDS proteins can be either competitive or
noncompetitive binding assays. In competitive binding
assays, the sample analyte competes with a labeled
analyte for specific binding sites on a capture agent
bound to a solid surface. Preferably the capture agent
is a purified recombinant SOCS or WDS protein produced as
described above. Other sources of these proteins,
including isolated or partially purified naturally
occurring protein, may also be used. Noncompetitive
assays include sandwich assays, in which the sample
analyte is bound between two analyte-specific binding
reagents. One of the binding agents is used as a capture
agent and is bound to a solid surface. The second
binding agent is labeled and is used to measure or detect
the resultant complex by visual or instrument means. A
number of combinations of capture agent and labeled
binding agent can be used. A variety of different
immunoassay formats, separation techniques, and labels
can also be used, similar to those described above for
the measurement of SOCS or WDS proteins.






V. Making SOCS or WDS proteins; Mimetics 
DNAs which encode a SOCS or WDS protein or fragments
thereof can be obtained by chemical synthesis, by screening
cDNA libraries, or by screening genomic libraries
prepared from a wide variety of cell lines or tissue
samples. Methods for doing so, or for making expression
vectors, are described herein.

These DNAs can be expressed in a wide variety of
host cells for the synthesis of a full-length protein or
fragments which can in turn, e.g., be used to generate
polyclonal or monoclonal antibodies; for binding studies;
for construction and expression of modified molecules;
and for structure/function studies. Each SOCS or WDS
protein or its fragments can be expressed in host cells
that are transformed or transfected with appropriate
expression vectors. By "transformed" is meant a cell
into which (or into an ancestor of which) has been
introduced, by means of recombinant techniques, a DNA
molecule that encodes a SOCS or WDS polypeptide.
Heterologously expressed SOCS or WDS polypeptides can be
substantially purified to be free of protein or cellular
contaminants, other than those derived from the
recombinant host, and therefore are particularly useful
in pharmaceutical compositions when combined with a
pharmaceutically acceptable carrier and/or diluent. The
antigen, e.g., SOCS or WDS protein, or portions thereof,
may be expressed as fusions with other proteins or
with an epitope tag.

Expression vectors are typically self-replicating
DNA or RNA constructs containing the desired antigen gene
or its fragments, usually operably linked to appropriate
genetic control elements that are recognized in a
suitable host cell. The specific type of control
elements necessary to effect expression will depend upon
the eventual host cell used. Generally, the genetic
control elements can include a prokaryotic promoter
system or a eukaryotic promoter expression control
system, and typically include a transcriptional promoter,
an optional operator to control the onset of
transcription, transcription enhancers to elevate the
level of mRNA expression, a sequence that encodes a
suitable ribosome binding site, and sequences that
terminate transcription and translation. All of the
associated elements both necessary and sufficient for the
production of SOCS or WDS polypeptide will be in operable
linkage with the nucleic acid encoding a SOCS or WDS
polypeptide. Expression vectors also usually contain an
origin of replication that allows the vector to replicate
independently from the host cell.

The vectors of this invention contain DNAs which
encode a SOCS or WDS protein, or a fragment thereof,
typically encoding, e.g., a biologically active
polypeptide or protein. The DNA can be under the
control of a viral promoter and can encode a selection
marker. This invention further contemplates use of such
expression vectors which are capable of expressing
eukaryotic cDNA coding for a SOCS or WDS protein in a
prokaryotic or eukaryotic host, where the vector is
compatible with the host and where the eukaryotic cDNA
coding for the protein is inserted into the vector such
that growth of the host containing the vector expresses
the cDNA in question. Usually, expression vectors are
designed for stable replication in their host cells or
for amplification to greatly increase the total number of
copies of the desirable gene per cell. It is not always
necessary to require that an expression vector replicate
in a host cell; e.g., it is possible to effect transient
expression of the protein or its fragments in various
hosts using vectors that do not contain a replication
origin that is recognized by the host cell. It is also
possible to use vectors that cause integration of a SOCS
or WDS protein gene or its fragments into the host DNA by
recombination, or to integrate a promoter which controls
expression of an endogenous gene.

Vectors, as used herein, contemplate plasmids,
viruses, bacteriophage, integratable DNA fragments, and
other vehicles which enable the integration of DNA
fragments into the genome of the host. Expression
vectors are specialized vectors which contain genetic
control elements that effect expression of operably
linked genes. Plasmids are the most commonly used form
of vector, but many other forms of vectors which serve an
equivalent function are suitable for use herein. See,
e.g., Pouwels, et al. (1985 and Supplements) Cloning
Vectors: A Laboratory Manual, Elsevier, N.Y.; and
Rodriguez, et al. (eds.) (1988) Vectors: A Survey of
Molecular Cloning Vectors and Their Uses, Butterworth,
Boston, MA.

Suitable host cells include prokaryotes, lower
eukaryotes, and higher eukaryotes. Prokaryotes include
both gram negative and gram positive organisms, e.g., E.
coli and B. subtilis. Lower eukaryotes include yeasts,
e.g., S. cerevisiae and Pichia, and species of the genus
Dictyostelium. Higher eukaryotes include established
tissue culture cell lines from animal cells, both of
non-mammalian origin, e.g., insect cells and bird cells, and
of mammalian origin, e.g., human, primate, and rodent cells.

Prokaryotic host-vector systems include a wide
variety of vectors for many different species. As used
herein, E. coli and its vectors will be used generically
to include equivalent vectors used in other prokaryotes.
A representative vector for amplifying DNA is pBR322 or
its derivatives. Vectors that can be used to express
these proteins or protein fragments include, but are not
limited to, such vectors as those containing the lac
promoter (pUC-series); trp promoter (pBR322-trp); lpp
promoter (the pIN-series); lambda-pP or pR promoters
(pOTS); or hybrid promoters such as ptac (pDR540). See
Brosius, et al. (1988) "Expression Vectors Employing
Lambda-, trp-, lac-, and lpp-derived Promoters", in
Rodriguez and Denhardt (eds.) Vectors: A Survey of
Molecular Cloning Vectors and Their Uses 10:205-236,
Butterworth, Boston, MA.

Lower eukaryotes, e.g., yeasts and Dictyostelium,
may be transformed with vectors containing SOCS or WDS
protein sequences. For purposes of this invention, the
most common lower eukaryotic host is the baker's yeast,
Saccharomyces cerevisiae. It will be used generically to
represent lower eukaryotes although a number of other
strains and species are also available. Yeast vectors
typically consist of a replication origin (unless of the
integrating type), a selection gene, a promoter, DNA
encoding the desired protein or its fragments, and
sequences for translation termination, polyadenylation,
and transcription termination. Suitable expression
vectors for yeast include such constitutive promoters as
3-phosphoglycerate kinase and various other glycolytic
enzyme gene promoters or such inducible promoters as the
alcohol dehydrogenase 2 promoter or metallothionein
promoter. Suitable vectors include derivatives of the
following types: self-replicating low copy number (such
as the YRp-series); self-replicating high copy number
(such as the YEp-series); integrating types (such as the
YIp-series); or mini-chromosomes (such as the
YCp-series).

Higher eukaryotic tissue culture cells are typically
the preferred host cells for expression of the
functionally active SOCS or WDS protein. In principle,
many higher eukaryotic tissue culture cell lines may be
used, e.g., insect baculovirus expression systems,
whether from an invertebrate or vertebrate source.
However, mammalian cells are preferred to achieve proper
processing, both cotranslationally and
posttranslationally. Transformation or transfection and
propagation of such cells are routine. Useful cell lines
include HeLa cells, Chinese hamster ovary (CHO) cell
lines, baby rat kidney (BRK) cell lines, insect cell
lines, bird cell lines, and monkey (COS) cell lines.
Expression vectors for such cell lines usually include an
origin of replication, a promoter, a translation
initiation site, RNA splice sites (e.g., if genomic DNA
is used), a polyadenylation site, and a transcription
termination site. These vectors also may contain a
selection gene or amplification gene. Suitable
expression vectors may be plasmids, viruses, or
retroviruses carrying promoters derived, e.g., from such
sources as adenovirus, SV40, parvoviruses, vaccinia
virus, or cytomegalovirus. Representative examples of
suitable expression vectors include pCDNA1; pCD, see
Okayama, et al. (1985) Mol. Cell Biol. 5:1136-1142;
pMC1neo Poly-A, see Thomas, et al. (1987) Cell
51:503-512; and a baculovirus vector such as pAC 373 or
pAC 610.

It is likely that SOCS or WDS proteins need not be
glycosylated to elicit biological responses. However, it
will occasionally be desirable to express a SOCS or WDS
polypeptide in a system which provides a specific
or defined glycosylation pattern. In this case, the
usual pattern will be that provided naturally by the
expression system. However, the pattern will be
modifiable by exposing the polypeptide, e.g., in
unglycosylated form, to appropriate glycosylating
proteins introduced into a heterologous expression
system. For example, the SOCS or WDS protein gene may be
co-transformed with one or more genes encoding mammalian
or other glycosylating enzymes. It is further understood
that overglycosylation may be detrimental to SOCS or WDS
protein biological activity, and that one of skill may
perform routine testing to optimize the degree of
glycosylation which confers optimal biological activity.

Furthermore, heterologously expressed proteins or
polypeptides can also be expressed in plant cells. For
plant cells, viral expression vectors (e.g., cauliflower
mosaic virus and tobacco mosaic virus) and plasmid
expression vectors (e.g., Ti plasmid) are suitable. Such
cells are available from a wide range of sources (e.g.,
the American Type Culture Collection, Rockville, MD;
see also, for example, Ausubel, et al. (cur. ed. and
Supplements)); expression vehicles may be chosen from those
provided, e.g., in Pouwels, et al. (cur. ed.) Cloning
Vectors: A Laboratory Manual.

A SOCS or WDS protein, or a fragment thereof, may be
engineered to be phosphatidyl inositol (PI) linked to a
cell membrane, but can be removed from membranes by
treatment with a phosphatidyl inositol cleaving enzyme,
e.g., phosphatidyl inositol phospholipase-C. This
releases the antigen in a biologically active form, and
allows purification by standard procedures of protein
chemistry. See, e.g., Low (1989) Biochim. Biophys. Acta
988:427-454; Tse, et al. (1985) Science 230:1003-1008;
and Brunner, et al. (1991) J. Cell Biol. 114:1275-1283.

Now that SOCS or WDS proteins have been
characterized, fragments or derivatives thereof can be
prepared by conventional processes for synthesizing
peptides. These include processes such as are described
in Stewart and Young (1984) Solid Phase Peptide Synthesis,
Pierce Chemical Co., Rockford, IL; Bodanszky and
Bodanszky (1984) The Practice of Peptide Synthesis,
Springer-Verlag, New York, NY; and Bodanszky (1984) The
Principles of Peptide Synthesis, Springer-Verlag, New
York, NY. For example, an azide process, an acid
chloride process, an acid anhydride process, a mixed
anhydride process, an active ester process (for example,
p-nitrophenyl ester, N-hydroxysuccinimide ester, or
cyanomethyl ester), a carbodiimidazole process, an
oxidative-reductive process, or a
dicyclohexylcarbodiimide (DCCD)/additive process can be
used. Solid phase and solution phase syntheses are both
applicable to the foregoing processes.

The prepared protein and fragments thereof can be
isolated and purified from the reaction mixture by means
of peptide separation, for example, by extraction,
precipitation, electrophoresis and various forms of
chromatography, and the like. The SOCS or WDS proteins
of this invention can be obtained in varying degrees of
purity depending upon their desired use. Purification can
be accomplished by use of known protein purification
techniques or by the use of the antibodies or binding
partners herein described, e.g., in immunoabsorbent
affinity chromatography. This immunoabsorbent affinity
chromatography is carried out by first linking the
antibodies to a solid support and then contacting the
linked antibodies with solubilized lysates of appropriate
source cells, lysates of other cells expressing the
protein, or lysates or supernatants of cells producing
the SOCS or WDS proteins as a result of recombinant DNA
techniques, see below.

Multiple cell lines may be screened for one which
expresses a SOCS or WDS protein at a high level compared
with other cells. Various cell lines, e.g., a mouse
thymic stromal cell line TA4, are screened and selected
for their favorable handling properties. Natural SOCS or
WDS proteins can be isolated from natural sources, or by
expression from a transformed cell using an appropriate
expression vector. Purification of the expressed protein
is achieved by standard procedures, or may be combined
with engineered means for effective purification at high
efficiency from cell lysates or supernatants. Epitope or
other tags, e.g., FLAG or His6 segments, can be used for
such purification features.

VI. Physical Variants 

This invention also encompasses proteins or peptides
having substantial amino acid sequence similarity with an
amino acid sequence of a SOCS or WDS protein. Natural
variants include individual, polymorphic, allelic,
strain, or species variants.

Amino acid sequence similarity, or sequence
identity, is determined by optimizing residue matches, if
necessary, by introducing gaps as required. This changes
when considering conservative substitutions as matches.
Conservative substitutions typically include
substitutions within the following groups: glycine,
alanine; valine, isoleucine, leucine; aspartic acid,
glutamic acid; asparagine, glutamine; serine, threonine;
lysine, arginine; and phenylalanine, tyrosine.
Homologous amino acid sequences include natural
polymorphic, allelic, and interspecies variations in each
respective protein sequence. Typical homologous proteins
or peptides will have from 50-100% similarity (if gaps
can be introduced), to 75-100% similarity (if
conservative substitutions are included) over fixed
stretches of amino acids with the amino acid sequence of
the SOCS or WDS protein. Similarity measures will be at
least about 50%, generally at least 65%, usually at least
70%, preferably at least 75%, and more preferably at
least 90%, and in particularly preferred embodiments, at
least 96% or more. See also Needleman, et al. (1970)
J. Mol. Biol. 48:443-453; Sankoff, et al. (1983) Time Warps,
String Edits, and Macromolecules: The Theory and Practice
of Sequence Comparison Chapter One, Addison-Wesley,
Reading, MA; and software packages from IntelliGenetics,
Mountain View, CA; and the University of Wisconsin
Genetics Computer Group, Madison, WI. Stretches of amino
acids will be at least about 10 amino acids, often
about 20 amino acids, usually 50 amino acids, preferably
75 amino acids, and in particularly preferred embodiments
at least about 100 amino acids. Identity can also be
measured over amino acid stretches of about 98, 99, 110,
120, 130, etc.
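The distinction between identity and similarity can be made concrete
with a short Python sketch that scores two sequences that have already
been aligned. The conservative groups are those listed above; the
function names, the one-letter residue codes, and the choice to skip
gap columns rather than penalize them are assumptions made for
illustration, and the alignment step itself (e.g., by the Needleman
method cited above) is not shown.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GA"),   # glycine, alanine
    set("VIL"),  # valine, isoleucine, leucine
    set("DE"),   # aspartic acid, glutamic acid
    set("NQ"),   # asparagine, glutamine
    set("ST"),   # serine, threonine
    set("KR"),   # lysine, arginine
    set("FY"),   # phenylalanine, tyrosine
]


def is_conservative(a: str, b: str) -> bool:
    """True if two different residues fall within one conservative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity_and_similarity(seq1: str, seq2: str, gap: str = "-"):
    """Score two pre-aligned, equal-length sequences.

    Columns containing a gap are skipped (an assumption of this sketch).
    Returns (percent identity, percent similarity).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    compared = identical = similar = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif is_conservative(a, b):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Counting conservative substitutions as matches can only raise the
score, which is why the similarity ranges quoted above are higher
when such substitutions are included.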

Nucleic acids encoding mammalian SOCS or WDS
proteins will typically hybridize to the nucleic acid
sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, or 15 under
stringent conditions. For example, nucleic acids
encoding human SOCS or WDS proteins will normally
hybridize to the nucleic acid of SEQ ID NO: 1, 3, 5, 7,
9, 11, 13, or 15 under stringent hybridization
conditions. Generally, stringent conditions are selected
to be about 10° C lower than the thermal melting point
(Tm) for the probe sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic
strength and pH) at which 50% of the target sequence
hybridizes to a perfectly matched probe. Typically,
stringent conditions will be those in which the salt
concentration is about 0.2 molar at pH 7 and the
temperature is at least about 50° C. Other factors may
significantly affect the stringency of hybridization,
including, among others, base composition and size of the
complementary strands, the presence of organic solvents
such as formamide, and the extent of base mismatching. A
preferred embodiment will include nucleic acids which
will bind to disclosed sequences in 50% formamide and 200
mM NaCl at 42° C.
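How salt, base composition, probe length, and formamide shift the Tm
can be sketched with a widely used empirical approximation for long
DNA duplexes. The formula and its coefficients are not taken from this
document and vary somewhat between references, so the numbers should
be treated as rough guides only. The function names and the example
probe are hypothetical; the 10° C offset follows the text.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def approx_tm_celsius(seq: str, sodium_molar: float,
                      formamide_percent: float = 0.0) -> float:
    """Empirical Tm estimate (an assumed, commonly cited approximation):

    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent(seq)
            - 0.61 * formamide_percent
            - 500.0 / len(seq))


def stringent_temperature(tm_celsius: float, offset: float = 10.0) -> float:
    """Hybridization temperature set about 10 degrees C below the Tm."""
    return tm_celsius - offset


if __name__ == "__main__":
    probe = "ATGC" * 25  # hypothetical 100-nucleotide probe, 50% GC
    tm = approx_tm_celsius(probe, sodium_molar=0.2, formamide_percent=50.0)
    print(round(tm, 1), round(stringent_temperature(tm), 1))
```

Under this approximation each percent of formamide lowers the Tm by
roughly 0.6° C, which is why hybridization in 50% formamide can be run
at a much lower temperature than in aqueous buffer.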

Nucleic acids that hybridize to a SOCS nucleic acid of
the invention can be used as a cloning probe, a primer
(e.g., a PCR primer), or a diagnostic probe. Hybridizing
nucleic acids can be splice variants encoded by one of
the SOCS genes described herein. Thus, the hybridizing
nucleic acids may encode a polypeptide that is shorter or
longer than the various forms of SOCS described herein.
Hybridizing nucleic acids may also encode proteins that
are related to SOCS (e.g., polypeptides encoded by genes
that include a portion having a relatively high degree of
identity to a SOCS gene described herein).

An isolated SOCS or WDS protein encoding DNA can be
readily modified by nucleotide substitutions, nucleotide
deletions, nucleotide insertions, and short inversions of
nucleotide stretches. These modifications result in
novel DNA sequences which encode SOCS or WDS protein
antigens, their derivatives, or proteins having highly
similar physiological, immunogenic, or antigenic
activity.

Modified sequences can be used to produce mutant
antigens or to enhance expression. Enhanced expression
may involve gene amplification, increased transcription,
increased translation, and other mechanisms. Such mutant
SOCS or WDS protein derivatives include predetermined or
site-specific mutations of the respective protein or its
fragments. "Mutant SOCS or WDS protein" encompasses a
polypeptide otherwise falling within the homology
definition of the human or rodent SOCS or WDS protein as
set forth above, but having an amino acid sequence which
differs from that of a SOCS or WDS protein as found in
nature, whether by way of deletion, substitution, or
insertion. In particular, "site specific mutant SOCS or
WDS protein" generally includes proteins having
significant similarity with a protein having a sequence
of SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, or 16, e.g.,
natural embodiments, and as sharing various biological
activities, e.g., antigenic or immunogenic, with those
sequences, and in preferred embodiments contain most or
all of the disclosed sequence. This applies also to
polymorphic variants from different individuals. Similar
concepts apply to different SOCS or WDS proteins,
particularly those found in various warm blooded animals,
e.g., mammals and birds. As stated before, it is
emphasized that descriptions are generally meant to
encompass other SOCS or WDS proteins, not limited to the
human embodiments specifically discussed.

The invention encompasses, but is not limited to,
SOCS proteins and polypeptides that are functionally
related to SOCS encoded by the specific sequence
identifiers of the present application. Functionally
related proteins and polypeptides include any protein or
polypeptide sharing a functional characteristic with SOCS
of the present invention, e.g., the ability to interact
with Janus family tyrosine kinases or the ability to be
induced by IL-2 receptor activation. Such functionally
related SOCS polypeptides include, but are not limited
to, additions or substitutions of amino acid residues
within the amino acid sequence encoded by the SOCS
sequences described herein which result in a silent
change, thus producing a functionally equivalent SOCS
polypeptide. Amino acid substitutions may be made on the
basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved.

For example, nonpolar (hydrophobic) amino acids
include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral
amino acids include glycine, serine, threonine, cysteine,
tyrosine, asparagine, and glutamine; positively charged
(basic) amino acids include arginine, lysine, and
histidine; and negatively charged (acidic) amino acids
include aspartic acid and glutamic acid.
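These four classes can be encoded as a small lookup to check whether a
proposed substitution stays within one class. The sketch below uses
the groupings exactly as listed above; the function names and the use
of one-letter residue codes are illustrative assumptions.

```python
# Residue classes as listed in the text, written in one-letter codes.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in AMINO_ACID_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def same_class(original: str, replacement: str) -> bool:
    """True if a substitution keeps the residue within its class."""
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(same_class("K", "R"))  # lysine to arginine, both basic
    print(same_class("D", "K"))  # aspartic acid to lysine, charge reversed
```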

While random mutations can be made to SOCS nucleic
acid (using well known random mutagenesis techniques) and
the resulting SOCS polypeptides can be tested for
activity, site-directed mutations of SOCS coding
sequences can be engineered (using well known
site-directed mutagenesis techniques) to generate mutant SOCS
with increased function, e.g., greater inhibition of Janus
kinase activity or greater resistance to degradation.

To design functionally related and functionally
variant SOCS polypeptides, it is useful to distinguish
between conserved and variable amino acid residues using the
homology comparison tables provided herein.

To preserve SOCS function, it is preferable that
conserved residues remain unaltered and that the
conformational folding of the SOCS functional sites be
preserved. Preferably, alterations of non-conserved
residues are carried out with conservative alterations,
e.g., a basic amino acid is replaced by a different basic
amino acid. To produce altered function variants, it is
preferred to make non-conservative changes at variable
and/or conserved residues. Deletions at conserved and
variable residues can also be used to create altered
function variants.

Although site specific mutation sites are
predetermined, mutants need not be site specific. SOCS
or WDS protein mutagenesis can be conducted by making
amino acid insertions or deletions. Substitutions,
deletions, insertions, or any combinations may be
generated to arrive at a final construct. Insertions
include amino- or carboxyl-terminal fusions, e.g.,
epitope tags. Random mutagenesis can be conducted at a
target codon and the expressed mutants can then be
screened for the desired activity. Methods for making
substitution mutations at predetermined sites in DNA
having a known sequence are well known in the art, e.g.,
by M13 primer mutagenesis or polymerase chain reaction
(PCR) techniques. See also, Sambrook, et al. (1989) and
Ausubel, et al. (1987 and Supplements). The mutations in
the DNA normally should not place coding sequences out of
reading frames and preferably will not create
complementary regions that could hybridize to produce
secondary mRNA structure such as loops or hairpins.
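The reading-frame requirement can be checked mechanically before a
mutant construct is made. The following is a minimal sketch under
stated assumptions: the coding sequence starts at position 0 in frame,
only the net length change is examined, and the helper names and
example sequences are hypothetical. It does not attempt to predict
mRNA secondary structure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def preserves_reading_frame(original_cds: str, mutant_cds: str) -> bool:
    """True if the net insertion or deletion is a multiple of three bases."""
    return (len(mutant_cds) - len(original_cds)) % 3 == 0


def first_in_frame_stop(cds: str):
    """Index of the first in-frame stop codon, or None if there is none."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


if __name__ == "__main__":
    original = "ATGGCCAAATAA"         # ATG GCC AAA TAA
    one_base_insert = "ATGGGCCAAATAA"  # shifts every downstream codon
    codon_deletion = "ATGAAATAA"       # removes GCC, frame is kept
    print(preserves_reading_frame(original, one_base_insert))
    print(preserves_reading_frame(original, codon_deletion))
    print(first_in_frame_stop(codon_deletion))
```

A check on the first in-frame stop codon is included because a
frameshift usually reveals itself as a premature or missing stop.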

The present invention also provides recombinant
proteins, e.g., heterologous fusion proteins using
segments from these proteins. A heterologous fusion
protein is a fusion of proteins or segments which are
naturally not normally fused in the same manner, e.g., a
marker polypeptide or fusion partner. For example, the
polypeptide can be fused to a hexa-histidine tag to
facilitate purification of bacterially expressed protein
or a hemagglutinin tag to facilitate purification of
protein expressed in eukaryotic cells. Thus, the fusion
product of an immunoglobulin with a SOCS or WDS protein
polypeptide is a continuous protein molecule having
sequences fused in a typical peptide linkage, typically
made as a single translation product and exhibiting
properties derived from each source peptide. A similar
concept applies to heterologous nucleic acid sequences.
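At the nucleic acid level, a hexa-histidine fusion amounts to joining
the tag codons to the coding sequence in frame so that a single
translation product results. The sketch below appends the tag at the
carboxyl terminus, which is an assumption (the text does not specify
the position), and the function name and example sequence are
hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
HIS_CODON = "CAT"  # CAT and CAC both encode histidine


def add_c_terminal_his_tag(cds: str, n_his: int = 6, stop: str = "TAA") -> str:
    """Return the coding sequence with an in-frame C-terminal His tag.

    Any existing terminal stop codon is removed so that translation
    continues through the tag, and a new stop codon is added after it.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if stop not in STOP_CODONS:
        raise ValueError("stop must be one of TAA, TAG, TGA")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    return cds + HIS_CODON * n_his + stop


if __name__ == "__main__":
    print(add_c_terminal_his_tag("ATGGCCTGA"))
```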

In addition, new constructs may be made from
combining similar functional domains from other proteins.
For example, protein-binding or other segments may be
"swapped" between different new fusion polypeptides or
fragments. See, e.g., Cunningham, et al. (1989) Science
243:1330-1336; and O'Dowd, et al. (1988) J. Biol. Chem.
263:15985-15992. Thus, new chimeric polypeptides
exhibiting new combinations of specificities will result
from the functional linkage of protein-binding
specificities and other functional domains.

VII. Functional Variants 

The blocking of physiological response to SOCS or
WDS protein may result from the inhibition of binding of
the protein to its binding partner, e.g., through
competitive inhibition. Thus, in vitro assays of the
present invention will often use isolated protein,
membranes from cells expressing a recombinant membrane
associated SOCS or WDS protein, soluble fragments
comprising binding segments of these proteins, or
fragments attached to solid phase substrates. These
assays will also allow for the diagnostic determination
of the effects of either binding segment mutations and
modifications, or protein mutations and modifications,
e.g., protein analogs. This invention also contemplates
the use of competitive drug screening assays, e.g., where
neutralizing antibodies to antigen or binding partner
fragments compete with a test compound for binding to the
protein. In this manner, the antibodies can be used to
detect the presence of a polypeptide which shares one or
more antigenic binding sites of the protein and can also
be used to occupy binding sites on the protein that might
otherwise interact with a binding partner.

"Derivatives" of SOCS or WDS protein antigens
include amino acid sequence mutants, glycosylation
variants, and covalent or aggregate conjugates with other
chemical moieties. Covalent derivatives can be prepared
by linkage of functionalities to groups which are found
in SOCS or WDS protein amino acid side chains or at the
N- or C-termini, by means which are well known in the
art. These derivatives can include, without limitation,
aliphatic esters or amides of the carboxyl terminus, or
of residues containing carboxyl side chains, O-acyl
derivatives of hydroxyl group-containing residues, and
N-acyl derivatives of the amino terminal amino acid or
amino-group containing residues, e.g., lysine or
arginine. Acyl groups are selected from the group of
alkyl-moieties including C3 to C18 normal alkyl, thereby
forming alkanoyl aroyl species. Covalent attachment to
carrier proteins may be important when immunogenic
moieties are haptens.

In particular, glycosylation alterations are
included, e.g., made by modifying the glycosylation
patterns of a polypeptide during its synthesis and
processing, or in further processing steps. Particularly
preferred means for accomplishing this are by exposing
the polypeptide to glycosylating enzymes derived from
cells which normally provide such processing, e.g.,
mammalian glycosylation enzymes. Deglycosylation enzymes
are also contemplated. Also embraced are versions of the
same primary amino acid sequence which have other minor
modifications, including phosphorylated amino acid
residues, e.g., phosphotyrosine, phosphoserine, or
phosphothreonine, or other moieties, including ribosyl
groups or cross-linking reagents.

A major group of derivatives are covalent conjugates
of the SOCS or WDS protein or fragments thereof with
other proteins or polypeptides. These derivatives can be
synthesized in recombinant culture such as N- or
C-terminal fusions or by the use of agents known in the art
for their usefulness in cross-linking proteins through
reactive side groups. Preferred protein derivatization
sites with cross-linking agents are at free amino groups,
carbohydrate moieties, and cysteine residues.

Fusion polypeptides between SOCS or WDS protein and
other homologous or heterologous proteins are also
provided. Heterologous polypeptides may be fusions
between different surface markers, resulting in, e.g., a
hybrid protein exhibiting binding partner specificity.
Likewise, heterologous fusions may be constructed which
would exhibit a combination of properties or activities
of the derivative proteins. Typical examples are fusions
of a reporter polypeptide, e.g., luciferase, with a
segment or domain of a protein, e.g., a segment involved
in binding partner interaction, so that the presence or
location of the fused protein may be easily determined.
See, e.g., Dull, et al., U.S. Patent No. 4,859,609.
Other gene fusion partners include bacterial
β-galactosidase, trpE, Protein A, β-lactamase, alpha
amylase, alcohol dehydrogenase, and yeast alpha mating
factor. See, e.g., Godowski, et al. (1988) Science
241:812-816. The fusion partner can be constructed such
that it can be cleaved off such that a protein of
substantially natural length is generated.

Such polypeptides may also have amino acid residues
which have been chemically modified by phosphorylation,
sulfonation, biotinylation, or the addition or removal of
other moieties, particularly those which have molecular
shapes similar to phosphate groups. In some embodiments,
the modifications will be useful labeling reagents, or
serve as purification targets, e.g., affinity proteins.

This invention also contemplates the use of
derivatives of SOCS or WDS protein other than variations
in amino acid sequence or glycosylation. Such
derivatives may involve covalent or aggregative
association with chemical moieties. These derivatives
generally fall into three classes: (1) salts, (2)
side chain and terminal residue covalent modifications,
and (3) adsorption complexes, for example with cell
membranes. Such covalent or aggregative derivatives are
useful as immunogens, as reagents in immunoassays, or in
purification methods such as for affinity purification of
proteins or other binding proteins. For example, a SOCS
or WDS protein antigen can be immobilized by covalent
bonding to a solid support such as cyanogen
bromide-activated SEPHAROSE, by methods which are well
known in the art, or adsorbed onto polyolefin surfaces,
with or without glutaraldehyde cross-linking, for use in
the assay or purification of anti-SOCS or anti-WDS protein
antibodies or its respective binding partner. The SOCS
or WDS protein can also be labeled with a detectable
group, e.g., radioiodinated by the chloramine T
procedure, covalently bound to rare earth chelates, or
conjugated to another fluorescent moiety for use in
diagnostic assays. Purification of SOCS or WDS proteins
may be effected by immobilized antibodies or binding
partner.

Isolated SOCS or WDS protein genes will allow
transformation of cells lacking expression of
corresponding SOCS or WDS protein, e.g., either species
types or cells which lack corresponding proteins and
exhibit negative background activity. Expression of
transformed genes will allow isolation of antigenically
pure cell lines, with defined or single species variants.
This approach will allow for more sensitive detection and
discrimination of the physiological effects of SOCS or
WDS binding proteins. Subcellular fragments, e.g.,
cytoplasts or membrane fragments, can be isolated and
used.

VIII. Binding Agent: SOCS or :WDS Protein Complexes 

A SOCS or WDS protein that specifically binds to or
that is specifically immunoreactive with an antibody
generated against a defined immunogen, such as an
immunogen consisting of the amino acid sequence of SEQ ID
NO: 2, 4, 6, 8, 10, 12, 14, or 16, is typically determined
in an immunoassay. The immunoassay uses a polyclonal
antiserum which was raised to a protein of SEQ ID NO: 2,
4, 6, 8, 10, 12, 14, or 16. This antiserum is selected
to have low crossreactivity against other intracellular
regulatory proteins and any such crossreactivity is
removed by immunoabsorption prior to use in the
immunoassay.

In order to produce antisera for use in an
immunoassay, the protein of desired sequence, e.g., SEQ
ID NO: 2, 4, 6, 8, 10, 12, 14, and/or 16, is isolated as
described herein. For example, recombinant protein may
be produced in a mammalian cell line. An inbred strain
of mice such as Balb/c is immunized with the protein of
appropriate sequence using a standard adjuvant, such as
Freund's adjuvant, and a standard mouse immunization
protocol (see Harlow and Lane, supra). Alternatively, a
synthetic peptide, preferably near full length, derived
from the sequences disclosed herein and conjugated to a
carrier protein can be used as an immunogen. Polyclonal
sera are collected and titered against the immunogen
protein in an immunoassay, for example, a solid phase
immunoassay with the immunogen immobilized on a solid
support. Polyclonal antisera with a titer of 10^ or
greater are selected and tested for their
crossreactivity against other intracellular proteins,
using a competitive binding immunoassay such as the one
described in Harlow and Lane, supra, at pages 570-573.
Preferably two intracellular proteins are used in this
determination in conjunction with the desired SOCS or WDS
protein.

Immunoassays in the competitive binding format can
be used for the crossreactivity determinations. For
example, a protein of SEQ ID NO: 2 or 4 can be
immobilized to a solid support. Proteins added to the
assay compete with the binding of the antisera to the
immobilized antigen. The ability of the above proteins
to compete with the binding of the antisera to the
immobilized protein is compared to the protein of SEQ ID
NO: 2 or 4. The percent crossreactivity for the above
proteins is calculated, using standard calculations.
Those antisera with less than 10% crossreactivity with
each of the proteins listed above are selected and
pooled. The cross-reacting antibodies are then removed
from the pooled antisera by immunoabsorption with the
above-listed proteins.
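
The "standard calculations" for percent crossreactivity are
not spelled out in the text. The following is a minimal
sketch under one common convention, assumed here: the ratio
of the amount of immunogen to the amount of competitor
required for 50% inhibition, times 100. The function names,
the data layout, and the example values are hypothetical;
only the 10% selection threshold comes from the text.

```python
def percent_crossreactivity(ic50_immunogen, ic50_competitor):
    """Percent crossreactivity under the IC50-ratio convention (an assumption).

    ic50_immunogen:  amount of immunogen protein giving 50% inhibition
    ic50_competitor: amount of competing protein giving 50% inhibition
    Both values must be in the same units.
    """
    if ic50_immunogen <= 0 or ic50_competitor <= 0:
        raise ValueError("IC50 values must be positive")
    return 100.0 * ic50_immunogen / ic50_competitor


def select_antisera(panel, threshold=10.0):
    """Return the antisera whose crossreactivity with every tested
    protein is below the threshold (10% in the text).

    panel maps an antiserum name to a tuple:
    (ic50 of the immunogen, list of ic50 values for the other proteins).
    """
    selected = []
    for name, (ic50_immunogen, competitor_ic50s) in panel.items():
        values = [percent_crossreactivity(ic50_immunogen, c)
                  for c in competitor_ic50s]
        if all(v < threshold for v in values):
            selected.append(name)
    return selected


# Hypothetical panel: two antisera, each tested against two proteins.
example_panel = {
    "antiserum_A": (1.0, [20.0, 50.0]),
    "antiserum_B": (1.0, [5.0, 50.0]),
}
selected_antisera = select_antisera(example_panel)
```

Under this convention a competitor that needs 20 times as
much protein as the immunogen to reach 50% inhibition
scores 5% crossreactivity, so an antiserum passes only if
every listed protein scores below the threshold.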

The immunoabsorbed and pooled antisera are then used
in a competitive binding immunoassay as described above
to compare a second protein to the immunogen protein
(e.g., the SOCS14 or SOCS15 protein of SEQ ID NO: 2 and
6, or 4). In order to make this comparison, the two
proteins are each assayed at a wide range of
concentrations and the amount of each protein required to
inhibit 50% of the binding of the antisera to the
immobilized protein is determined. If the amount of the
second protein required is less than twice the amount of
the protein, e.g., of SEQ ID NO: 2 that is required, then
the second protein is said to specifically bind to an
antibody generated to the immunogen.
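
The comparison described above can be sketched as follows.
This is an illustration only: the log-scale interpolation
used to estimate the 50% point is an assumption, since the
text does not say how that amount is read from the
titration, and all names and values are hypothetical. The
"less than twice" criterion is taken from the text.

```python
import math


def amount_for_half_inhibition(amounts, percent_bound):
    """Estimate the competitor amount that leaves 50% of antiserum binding.

    amounts:       competitor amounts in ascending order (all positive)
    percent_bound: antiserum binding remaining at each amount, as a
                   percentage of the uninhibited control (decreasing)
    Interpolates linearly on a log10 amount scale (an assumption).
    """
    points = list(zip(amounts, percent_bound))
    for (a_lo, b_lo), (a_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 50.0 >= b_hi and b_lo != b_hi:
            frac = (b_lo - 50.0) / (b_lo - b_hi)
            log_a = math.log10(a_lo) + frac * (math.log10(a_hi) - math.log10(a_lo))
            return 10 ** log_a
    raise ValueError("titration does not bracket 50% inhibition")


def specifically_binds(amount_second, amount_immunogen):
    """Criterion from the text: the second protein qualifies if it needs
    less than twice the amount of immunogen protein for 50% inhibition."""
    return amount_second < 2.0 * amount_immunogen


# Hypothetical titration: amounts in arbitrary units, binding in percent.
half_point = amount_for_half_inhibition([1.0, 10.0, 100.0], [90.0, 50.0, 10.0])
```

Because the criterion is a strict inequality, a second
protein that requires exactly twice the immunogen amount
does not qualify.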

It is understood that each of the SOCS or WDS proteins
is a member of a respective family of homologous proteins
that comprises two or more genes. For a particular gene
product, such as the human SOCS14 or SOCS15 protein, the
term refers not only to the amino acid sequences
disclosed herein, but also to other proteins that are
polymorphic, allelic, non-allelic, or species variants.
It is also understood that the term "SOCS14 or SOCS15
protein" includes nonnatural mutations introduced by
deliberate mutation using conventional recombinant
technology such as single site mutation, or by excising
short sections of DNA encoding SOCS14 or SOCS15 proteins,
or by substituting new amino acids, or adding new amino
acids. Such minor alterations should substantially
maintain the immunoidentity of the original molecule
and/or its biological activity. Thus, these alterations
include proteins that are specifically immunoreactive
with a designated naturally occurring SOCS or WDS
protein, for example, the human SOCS14 or SOCS15 protein
shown in SEQ ID NO: 2 and 6, or 4 and 8. The biological
properties of the altered proteins can be determined by
expressing the protein in an appropriate cell line and
measuring, e.g., a proliferative effect. Particular
protein modifications considered minor would include
conservative substitution of amino acids with similar
chemical properties, as described above for the SOCS14 or
SOCS15 protein as a whole. By aligning a protein
optimally with the protein of SEQ ID NO: 2, 4, 6, or 8,
and by using the conventional immunoassays described
herein to determine immunoidentity, or by using
proliferative assays, one can determine the protein
compositions of the invention.

IX. Uses

The present invention provides reagents which will
find use in diagnostic applications as described
elsewhere herein, e.g., in the general description for
developmental abnormalities, or below in the description
of kits for diagnosis. Moreover, the SOCS proteins can
block signaling via cytokine receptors.

SOCS or WDS nucleotides, e.g., human SOCS14 or
SOCS15 DNA or RNA, may be used as a component in a
forensic assay. For instance, the nucleotide sequences
provided may be labeled using, e.g., 32P or biotin and
used to probe standard restriction fragment polymorphism
blots, providing a measurable character to aid in
distinguishing between individuals. Such probes may be
used in well-known forensic techniques such as genetic
fingerprinting. In addition, nucleotide probes made from
SOCS or WDS sequences may be used in in situ assays to
detect chromosomal abnormalities. For instance,
rearrangements in the human chromosome encoding a SOCS14
or SOCS15 gene may be detected via well-known in situ
techniques, using SOCS14 or SOCS15 probes in conjunction
with other known chromosome markers.

Antibodies and other binding agents directed towards
SOCS or WDS proteins or nucleic acids may be used to
purify the corresponding SOCS or WDS molecule. As
described in the Examples below, antibody purification of
SOCS or WDS protein components is both possible and
practicable. Antibodies and other binding agents may
also be used in a diagnostic fashion to determine whether
SOCS or WDS protein components are present in a tissue
sample or cell population using well-known techniques
described herein. The ability to attach a binding agent
to a SOCS or WDS protein provides a means to diagnose
disorders associated with SOCS or WDS protein
misregulation. Antibodies and other SOCS or WDS protein
binding agents may also be useful as histological
markers. It is likely that specific SOCS or WDS protein
expression is limited to specific tissue types. By
directing a probe, such as an antibody or nucleic acid, to
a SOCS14 or SOCS15 protein, it is possible to use the
probe to distinguish tissue and cell types in situ or in
vitro.

This invention also provides reagents with
significant therapeutic value. The SOCS or WDS protein
(naturally occurring or recombinant), fragments thereof,
and antibodies thereto, along with compounds identified
as having binding affinity to a SOCS or WDS protein, are
useful in the treatment of conditions associated with
abnormal physiology or development, including abnormal
proliferation, e.g., cancerous conditions, or
degenerative conditions. Abnormal proliferation,
regeneration, degeneration, and atrophy may be modulated
by appropriate therapeutic treatment using the
compositions provided herein. For example, a disease or
disorder associated with abnormal expression or abnormal
signaling by a SOCS or WDS protein is a target for an
agonist or antagonist of the protein. The proteins
likely play a role in regulation or development of
neuronal or hematopoietic cells, e.g., lymphoid cells,
which affect immunological responses.

For example, SOCS or WDS proteins likely play a role
in T cell activation deficiencies in which patients
develop clinical manifestations of T cell
immunodeficiency such as opportunistic infections,
recurrent viral or bacterial infections, diarrhea,
autoimmune hemolytic anemia, lymphoid hepatitis and
dermatitis, and Hodgkin lymphoma, at various stages of
childhood. An excess of SOCS proteins might lead to
SCID-like (severe combined immunodeficiencies) syndromes
while a deficit of SOCS or WDS proteins may lead to
malignant growth. For example, adult T cell
leukemia/lymphoma is a disease associated with
uncontrolled T-cell proliferation and is correlated at
the molecular level with the presence of the IL-2
receptor (Schechter, G.P.; "Chronic Lymphocytic Leukemia"
in Clinical Immunology: Principles and Practice, Rich
(ed.) Mosby, St. Louis (Curr. ed.)). A model for adult T
cell leukemia suggests that the disease may result from
constitutive activation of the IL-2 receptor and its
subsequent constitutive signaling cascade.



WO 99/03993 



51 



PCT/US98/14544 



Administration of exogenous SOCS to affected T cells may
modulate this disease.

Other abnormal developmental conditions are known in
cell types shown to possess SOCS or WDS protein mRNA by
northern blot analysis. See Berkow (ed.) The Merck
Manual of Diagnosis and Therapy, Merck & Co., Rahway,
N.J.; Thorn et al. Harrison's Principles of Internal
Medicine, McGraw-Hill, N.Y.; and Rich (ed.) Clinical
Immunology: Principles and Practice, Mosby, St. Louis
(Curr. ed.). Developmental or functional abnormalities,
e.g., of the neuronal or immune system, cause significant
medical abnormalities and conditions which may be
susceptible to prevention or treatment using compositions
provided herein.

Recombinant SOCS or WDS protein or SOCS or WDS
antibodies can be purified and then administered to a
patient. These reagents can be combined for therapeutic
use with additional active or inert ingredients, e.g., in
conventional pharmaceutically acceptable carriers or
diluents, e.g., immunogenic adjuvants, along with
physiologically innocuous stabilizers and excipients.
These combinations can be sterile filtered and placed
into dosage forms as by lyophilization in dosage vials or
storage in stabilized aqueous preparations. This
invention also contemplates use of antibodies or binding
fragments thereof, including forms which are not
complement binding.

Drug screening using antibodies or fragments thereof
can identify compounds having binding affinity to SOCS or
WDS protein, including isolation of associated
components. Subsequent biological assays can then be
utilized to determine if the compound has intrinsic
stimulating activity and is therefore a blocker or
antagonist in that it blocks the activity of the protein.
Likewise, a compound having intrinsic stimulating
activity can activate the binding partner and is thus an
agonist in that it simulates the activity of a SOCS or
WDS protein. This invention further contemplates the
therapeutic use of antibodies to SOCS or WDS protein as
antagonists. This approach should be particularly useful
with other SOCS or WDS protein species variants. 

Another therapeutic approach included within the
invention involves direct administration of reagents or
compositions by any conventional administration
technique (for example, but not restricted to, local
injection, inhalation, or systemic administration)
to the subject with an immune, allergic or trauma
disorder. The reagents, formulations or compositions
included within the bounds and metes of the invention may
also be targeted to specific cells by any of the methods
described herein. The actual dosage of reagent,
formulation or composition that modulates an immune
disorder depends on many factors, including the size and
health of an organism; however, one of ordinary
skill in the art can use the following teachings
describing the methods and techniques for determining
clinical dosages (Spilker (1984) Guide to Clinical
Studies and Developing Protocols, Raven Press Books,
Ltd., New York, pp. 7-13, 54-60; Spilker (1991) Guide to
Clinical Trials, Raven Press, Ltd., New York, pp. 93-101;
Craig and Stitzel (eds. 1986) Modern Pharmacology, 2d
ed., Little, Brown and Co., Boston, pp. 127-33; Speight
(ed. 1987) Avery's Drug Treatment: Principles and
Practice of Clinical Pharmacology and Therapeutics, 3d
ed., Williams and Wilkins, Baltimore, pp. 50-56;
Tallarida, et al. (1988) Principles in General
Pharmacology, Springer-Verlag, New York, pp. 18-20) to
determine the appropriate dosage to use; but, generally,
doses in the range of between about 0.5 fg/ml and 500 μg/ml
inclusive final concentration are administered per day to
an adult in any pharmaceutically-acceptable carrier.

The quantities of reagents necessary for effective
therapy will depend upon many different factors,
including means of administration, target site,
physiological state of the patient, and other medicants
administered. Thus, treatment dosages should be titrated
to optimize safety and efficacy. Typically, dosages
used in vitro may provide useful guidance in the amounts
useful for in situ administration of these reagents.
Animal testing of effective doses for treatment of
particular disorders will provide further predictive
indication of human dosage. Various considerations are
described, e.g., in Gilman, et al. (eds.) (1990) Goodman
and Gilman's: The Pharmacological Bases of Therapeutics
(8th ed.) Pergamon Press; and (1990) Remington's
Pharmaceutical Sciences (17th ed.) Mack Publishing Co.,
Easton, PA. Methods for administration are discussed
therein and below, e.g., for oral, intravenous,
intraperitoneal, or intramuscular administration,
transdermal diffusion, and others. Pharmaceutically
acceptable carriers will include water, saline, buffers,
and other compounds described, e.g., in the Merck Index,
Merck & Co., Rahway, NJ. Dosage ranges would ordinarily
be expected to be in amounts lower than 1 mM
concentrations, typically less than about 10 μM
concentrations, usually less than about 100 nM,
preferably less than about 10 pM (picomolar), and most
preferably less than about 1 fM (femtomolar), with an
appropriate carrier. Slow release formulations, or a
slow release apparatus will often be utilized for
continuous administration.

SOCS or WDS protein, fragments thereof, and
antibodies to it or its fragments, antagonists, and
agonists, may be administered directly to the host to be
treated or, depending on the size of the compounds, it
may be desirable to conjugate them to carrier proteins
such as ovalbumin or serum albumin prior to their
administration. Therapeutic formulations may be
administered in any conventional dosage formulation.
While it is possible for the active ingredient to be
administered alone, it is preferable to present it as a
pharmaceutical formulation. Formulations typically
comprise at least one active ingredient, as defined
above, together with one or more acceptable carriers
thereof. Each carrier should be both pharmaceutically
and physiologically acceptable in the sense of being
compatible with the other ingredients and not injurious
to the patient. Formulations include those suitable for
oral, rectal, nasal, or parenteral (including
subcutaneous, intramuscular, intravenous and intradermal)
administration. The formulations may conveniently be
presented in unit dosage form and may be prepared by any
methods well known in the art of pharmacy. See, e.g.,
Gilman, et al. (eds.) (1990) Goodman and Gilman's: The
Pharmacological Bases of Therapeutics (8th ed.) Pergamon
Press; and (1990) Remington's Pharmaceutical Sciences
(17th ed.) Mack Publishing Co., Easton, PA; Avis, et al.
(eds.) (1993) Pharmaceutical Dosage Forms: Parenteral
Medications Dekker, NY; Lieberman, et al. (eds.) (1990)
Pharmaceutical Dosage Forms: Tablets Dekker, NY; and
Lieberman, et al. (eds.) (1990) Pharmaceutical Dosage
Forms: Disperse Systems Dekker, NY. The therapy of this
invention may be combined with or used in association
with other therapeutic agents.

Both the naturally occurring and the recombinant
forms of the SOCS or WDS proteins of this invention are
particularly useful in kits and assay methods which are
capable of screening compounds for binding activity to
the proteins. Several methods of automating assays have
been developed in recent years so as to permit screening
of tens of thousands of compounds in a short period.
See, e.g., Fodor, et al. (1991) Science 251:767-773, and
other descriptions of chemical diversity libraries, which
describe means for testing of binding affinity by a
plurality of compounds. The development of suitable
assays can be greatly facilitated by the availability of
large amounts of purified, soluble SOCS or WDS protein as
provided by this invention.

For example, antagonists can normally be found once
the protein has been structurally defined. Testing of
potential protein analogs is now possible upon the
development of highly automated assay methods using a
purified binding partner. In particular, new agonists
and antagonists will be discovered by using screening
techniques described herein. Of particular importance
are compounds found to have a combined binding affinity
for multiple SOCS or WDS protein binding components,
e.g., compounds which can serve as antagonists for
species variants of a SOCS or WDS protein.

This invention is particularly useful for screening
compounds by using recombinant protein in a variety of
drug screening techniques. The advantages of using a
recombinant protein in screening for specific binding
partners include: (a) improved renewable source of the
SOCS or WDS protein from a specific source; (b)
potentially greater number of binding partners per cell
giving better signal to noise ratio in assays; and (c)
species variant specificity (theoretically giving greater
biological and disease specificity).

One method of drug screening utilizes eukaryotic or
prokaryotic host cells which are stably transformed with
recombinant DNA molecules expressing a SOCS or WDS
protein binding counterpart. Cells may be isolated which
express a binding counterpart in isolation from any
others. Such cells, either in viable or fixed form, can
be used for standard protein binding assays. See also,
Parce, et al. (1989) Science 246:243-247; and Owicki, et
al. (1990) Proc. Nat'l Acad. Sci. USA 87:4007-4011, which
describe sensitive methods to detect cellular responses.
Competitive assays are particularly useful, where the
cells (source of SOCS14 or SOCS15 protein) are contacted
and incubated with a labeled binding partner or antibody
having known binding affinity to the protein, such as
125I-antibody, and a test sample whose binding affinity
to the binding composition is being measured. The bound
and free labeled binding compositions are then separated
to assess the degree of protein binding. The amount of
test compound bound is inversely proportional to the
amount of labeled binding partner binding to the known
source. Any one of numerous techniques can be used to
separate bound from free protein to assess the degree of
protein binding. This separation step could typically
involve a procedure such as adhesion to filters followed
by washing, adhesion to plastic followed by washing, or
centrifugation of the cell membranes. Viable cells could
also be used to screen for the effects of drugs on SOCS
or WDS protein mediated functions, e.g., second messenger
levels, i.e., cell proliferation; inositol phosphate pool
changes, transcription using a luciferase-type assay; and
others. Some detection methods allow for elimination of
a separation step, e.g., a proximity sensitive detection
system.
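
The inverse relationship between bound test compound and
bound labeled binding partner in the competitive assay
above can be illustrated with a short calculation. This is
a sketch under stated assumptions: the readout is taken to
be counts per minute from a radiolabeled binding partner,
and the nonspecific-binding control is a conventional
addition that the text does not describe. All names and
values are hypothetical.

```python
def percent_inhibition(sample_cpm, total_cpm, nonspecific_cpm):
    """Percent of specific labeled binding displaced by a test sample.

    sample_cpm:      bound counts in the presence of the test sample
    total_cpm:       bound counts with no competitor present
    nonspecific_cpm: bound counts that are not displaceable
                     (assumed control, not described in the text)
    """
    specific_max = total_cpm - nonspecific_cpm
    if specific_max <= 0:
        raise ValueError("total binding must exceed nonspecific binding")
    specific = sample_cpm - nonspecific_cpm
    return 100.0 * (1.0 - specific / specific_max)


# Hypothetical counts: the less label remains bound, the more test
# compound has occupied the protein.
weak_competitor = percent_inhibition(10000.0, 10000.0, 1000.0)
half_competitor = percent_inhibition(5500.0, 10000.0, 1000.0)
full_competitor = percent_inhibition(1000.0, 10000.0, 1000.0)
```

Running the same calculation over a dilution series of the
test sample gives the inhibition curve from which a
relative binding affinity can be read.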

Another method utilizes membranes from transformed
eukaryotic or prokaryotic host cells as the source of a
SOCS or WDS protein. These cells are stably transformed
with DNA vectors directing the expression of a SOCS or
WDS protein, e.g., an engineered membrane bound form.
Essentially, the membranes would be prepared from the
cells and used in a protein binding assay such as the
competitive assay set forth above.

Still another approach is to use solubilized,
unpurified or solubilized, purified SOCS or WDS protein
from transformed eukaryotic or prokaryotic host cells.
This allows for a "molecular" binding assay with the
advantages of increased specificity, the ability to
automate, and high drug test throughput.

Another technique for drug screening involves an
approach which provides high throughput screening for
compounds having suitable binding affinity to a SOCS or
WDS protein antibody and is described in detail in
Geysen, European Patent Application 84/03564, published
on September 13, 1984. First, large numbers of different
small peptide test compounds are synthesized on a solid
substrate, e.g., plastic pins or some other appropriate
surface, see Fodor, et al., supra. Then all the pins are
reacted with solubilized, unpurified or solubilized,
purified SOCS or WDS protein antibody, and washed. The
next step involves detecting bound SOCS or WDS protein
antibody.

Rational drug design may also be based upon
structural studies of the molecular shapes of the SOCS or
WDS protein and other effectors or analogs. See, e.g.,
Methods in Enzymology vols 202 and 203. Effectors may be
other proteins which mediate other functions in response
to protein binding, or other proteins which normally
interact with the binding partner. One means for
determining which sites interact with specific other
proteins is a physical structure determination, e.g.,
x-ray crystallography or 2 dimensional NMR techniques.
These will provide guidance as to which amino acid
residues form molecular contact regions. For a detailed
description of protein structural determination, see,
e.g., Blundell and Johnson (1976) Protein Crystallography
Academic Press, NY.

A purified SOCS or WDS protein can be coated
directly onto plates for use in the aforementioned drug
screening techniques. However, non-neutralizing
antibodies to these proteins can be used as capture
antibodies to immobilize the respective protein on the
solid phase.

X. Kits 

This invention also contemplates use of SOCS or WDS
proteins, fragments thereof, peptides, and their fusion
products in a variety of diagnostic kits and methods for
detecting the presence of SOCS or WDS protein or a
binding partner. Typically the kit will have a
compartment containing either a defined SOCS or WDS
protein peptide or gene segment or a reagent which
recognizes one or the other, e.g., binding partner
fragments or antibodies.

A kit for determining the binding affinity of a test
compound to a SOCS or WDS protein would typically
comprise a test compound; a labeled compound, e.g., a
binding agent or antibody having known binding affinity
for the SOCS or WDS protein; a source of SOCS or WDS
protein (naturally occurring or recombinant); and a means
for separating bound from free labeled compound, such as
a solid phase for immobilizing the SOCS or WDS protein.
Once compounds are screened, those having suitable
binding affinity to the SOCS or WDS protein can be
evaluated in suitable biological assays, as are well
known in the art, to determine whether they act as
agonists or antagonists to the binding partner. The
availability of recombinant SOCS or WDS protein
polypeptides also provides well defined standards for
calibrating such assays.

A preferred kit for determining the concentration
of, for example, a SOCS or WDS protein in a sample would
typically comprise a labeled compound, e.g., binding
partner or antibody, having known binding affinity for
the SOCS or WDS protein, a source of SOCS or WDS protein
(naturally occurring or recombinant), and a means for
separating the bound from free labeled compound, for
example, a solid phase for immobilizing the SOCS or WDS
protein. Compartments containing reagents, and
instructions, will normally be provided.

Antibodies, including antigen binding fragments,
specific for the SOCS or WDS protein or fragments thereof
are useful in diagnostic applications to detect the
presence of elevated levels of SOCS or WDS protein and/or
its fragments. Such diagnostic assays can employ
lysates, live cells, fixed cells, immunofluorescence,
cell cultures, body fluids, and further can involve the
detection of antigens related to the protein in serum, or
the like. Diagnostic assays may be homogeneous (without
a separation step between free reagent and antigen-SOCS
or -WDS protein complex) or heterogeneous (with a
separation step). Various commercial assays exist, such
as radioimmunoassay (RIA), enzyme-linked
immunosorbent assay (ELISA), enzyme immunoassay (EIA),
enzyme-multiplied immunoassay technique (EMIT),
substrate-labeled fluorescent immunoassay (SLFIA), and
the like. For example, unlabeled antibodies can be
employed by using a second antibody which is labeled and
which recognizes the antibody to a SOCS or WDS protein or
to a particular fragment thereof. Similar assays have
also been extensively discussed in the literature. See,
e.g., Harlow and Lane (1988) Antibodies: A Laboratory
Manual, CSH Press, NY; Chan (ed.) (1987) Immunoassay: A
Practical Guide Academic Press, Orlando, FL; Price and
Newman (eds.) (1991) Principles and Practice of
Immunoassay Stockton Press, NY; and Ngo (ed.) (1988)
Nonisotopic Immunoassay Plenum Press, NY.

Anti-idiotypic antibodies may have similar use to
diagnose presence of antibodies against a SOCS or WDS
protein, as such may be diagnostic of various abnormal
states. For example, overproduction of SOCS or WDS
protein may result in production of various immunological
or other medical reactions which may be diagnostic of
abnormal physiological states, e.g., in cell growth,
activation, or differentiation.

Frequently, the reagents for diagnostic assays are
supplied in kits, so as to optimize the sensitivity of
the assay. For the subject invention, depending upon the
nature of the assay, the protocol, and the label, either
labeled or unlabeled antibody or binding partner, or
labeled SOCS or WDS protein is provided. This is usually
in conjunction with other additives, such as buffers,
stabilizers, materials necessary for signal production
such as substrates for enzymes, and the like.
Preferably, the kit will also contain instructions for
proper use and disposal of the contents after use.
Typically the kit has compartments for each useful
reagent. Desirably, the reagents are provided as a dry
lyophilized powder, where the reagents may be
reconstituted in an aqueous medium providing appropriate
concentrations of reagents for performing the assay.

Many of the aforementioned constituents of the drug
screening and the diagnostic assays may be used without
modification, or may be modified in a variety of ways.
For example, labeling may be achieved by covalently or
non-covalently joining a moiety which directly or
indirectly provides a detectable signal. In any of these
assays, the protein, test compound, SOCS or WDS protein,
or antibodies thereto can be labeled either directly or
indirectly. Possibilities for direct labeling include
label groups: radiolabels such as 125I, enzymes (U.S.
Pat. No. 3,645,090) such as peroxidase and alkaline
phosphatase, and fluorescent labels (U.S. Pat. No.
3,940,475) capable of monitoring the change in
fluorescence intensity, wavelength shift, or fluorescence
polarization. Possibilities for indirect labeling
include biotinylation of one constituent followed by
binding to avidin coupled to one of the above label
groups.

There are also numerous methods of separating the
bound from the free protein, or alternatively the bound
from the free test compound. The SOCS or WDS protein can
be immobilized on various matrices followed by washing.
Suitable matrices include plastic such as an ELISA plate,
filters, and beads. Methods of immobilizing the SOCS or
WDS protein to a matrix include, without limitation,
direct adhesion to plastic, use of a capture antibody,
chemical coupling, and biotin-avidin. The last step in
this approach involves the precipitation of
protein/binding partner or antigen/antibody complex by
any of several methods including those utilizing, e.g.,
an organic solvent such as polyethylene glycol or a salt
such as ammonium sulfate. Other suitable separation
techniques include, without limitation, the fluorescein
antibody magnetizable particle method described in
Rattle, et al. (1984) Clin. Chem. 30:1457-1461, and the
double antibody magnetic particle separation as described
in U.S. Pat. No. 4,659,678.

Methods for linking proteins or their fragments to
the various labels have been extensively reported in the
literature and do not require detailed discussion here.
Many of the techniques involve the use of activated
carboxyl groups either through the use of carbodiimide or
active esters to form peptide bonds, the formation of
thioethers by reaction of a mercapto group with an
activated halogen such as chloroacetyl, or an activated
olefin such as maleimide, for linkage, or the like.
Fusion proteins will also find use in these applications.

Another diagnostic aspect of this invention involves
use of oligonucleotide or polynucleotide sequences taken
from the sequence of a SOCS or WDS protein. These
sequences can be used as probes for detecting levels of
the SOCS or WDS protein message in samples from natural
sources, or patients suspected of having an abnormal
condition, e.g., cancer or developmental problem. The
preparation of both RNA and DNA nucleotide sequences, the
labeling of the sequences, and the preferred size of the
sequences have received ample description and discussion
in the literature. Normally an oligonucleotide probe
should have at least about 14 nucleotides, usually at
least about 18 nucleotides, and the polynucleotide probes
may be up to several kilobases. Various labels may be
employed, most commonly radionuclides, particularly ³²P.
However, other techniques may also be employed, such as
using biotin modified nucleotides for introduction into a
polynucleotide. The biotin then serves as the site for
binding to avidin or antibodies, which may be labeled
with a wide variety of labels, such as radionuclides,
fluorophores, enzymes, or the like. Alternatively,
antibodies may be employed which can recognize specific
duplexes, including DNA duplexes, RNA duplexes, DNA-RNA
hybrid duplexes, or DNA-protein duplexes. The antibodies
in turn may be labeled and the assay carried out where
the duplex is bound to a surface, so that upon the
formation of duplex on the surface, the presence of
antibody bound to the duplex can be detected. The use of
probes to the novel anti-sense RNA may be carried out
using many conventional techniques such as nucleic acid
hybridization, plus and minus screening, recombinational
probing, hybrid released translation (HRT), and hybrid
arrested translation (HART). This also includes
amplification techniques such as polymerase chain
reaction (PCR).

Diagnostic kits which also test for the qualitative
or quantitative presence of other markers are also
contemplated. Diagnosis or prognosis may depend on the
combination of multiple indications used as markers.
Thus, kits may test for combinations of markers. See,
e.g., Viallet, et al. (1989) Progress in Growth Factor
Res. 1:89-97.

The broad scope of this invention is best understood
with reference to the following examples, which are not
intended to limit the invention to specific embodiments.
EXAMPLES

I. General Methods
Many of the standard methods below are described or
referenced, e.g., in Maniatis, et al. (Cur. ed.)
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor Press, NY; Sambrook, et
al. (1989) Molecular Cloning: A Laboratory Manual (2d
ed.) Vols. 1-3, CSH Press, NY; Ausubel, et al., Biology,
Greene Publishing Associates, Brooklyn, NY; or Ausubel,
et al. (1987 and Supplements) Current Protocols in
Molecular Biology, Wiley/Greene, NY; Innis, et al. (eds.)
(1990) PCR Protocols: A Guide to Methods and Applications,
Academic Press, NY. Methods for protein purification
include such methods as ammonium sulfate precipitation,
column chromatography, electrophoresis, centrifugation,
crystallization, and others. See, e.g., Ausubel, et al.
(1987 and periodic supplements); Deutscher (1990) "Guide
to Protein Purification," Methods in Enzymology vol. 182,
and other volumes in this series; Coligan, et al. (1995
and supplements) Current Protocols in Protein Science,
John Wiley and Sons, New York, NY; P. Matsudaira (ed.)
(1993) A Practical Guide to Protein and Peptide
Purification for Microsequencing, Academic Press, San
Diego, CA; and manufacturer's literature on use of
protein purification products, e.g., Pharmacia,
Piscataway, NJ, or Bio-Rad, Richmond, CA. Combination
with recombinant techniques allows fusion to appropriate
segments (epitope tags), e.g., to a FLAG sequence or an
equivalent which can be fused, e.g., via a protease-
removable sequence. See, e.g., Hochuli (1989) Chemische
Industrie 12:69-70; Hochuli (1990) "Purification of
Recombinant Proteins with Metal Chelate Absorbent" in
Setlow (ed.) Genetic Engineering, Principle and Methods
12:87-98, Plenum Press, NY; and Crowe, et al. (1992)
QIAexpress: The High Level Expression & Protein
Purification System, QUIAGEN, Inc., Chatsworth, CA.

Standard immunological techniques are described,
e.g., in Hertzenberg, et al. (eds. 1996) Weir's Handbook
of Experimental Immunology vols 1-4, Blackwell Science;
Coligan (1991) Current Protocols in Immunology,
Wiley/Greene, NY; and Methods in Enzymology volumes 70,
73, 74, 84, 92, 93, 108, 116, 121, 132, 150, 162, and
163. Assays for neural cell biological activities are
described, e.g., in Wouterlood (ed. 1995) Neuroscience
Protocols modules 10, Elsevier; Methods in Neurosciences,
Academic Press; and Neuromethods, Humana Press, Totowa,
NJ. Methodology of developmental systems is described,
e.g., in Meisami (ed.) Handbook of Human Growth and
Developmental Biology, CRC Press; and Chrispeels (ed.)
Molecular Techniques and Approaches in Developmental
Biology, Interscience.

FACS analyses are described in Melamed, et al.
(1990) Flow Cytometry and Sorting, Wiley-Liss, Inc., New
York, NY; Shapiro (1988) Practical Flow Cytometry, Liss,
New York, NY; and Robinson, et al. (1993) Handbook of
Flow Cytometry Methods, Wiley-Liss, New York, NY.

II. Isolation of full length SOCS or WDS clones
Standard methods are used to isolate full length
genes. A cDNA library is prepared from an appropriate,
e.g., human, cell, preferably a STAT containing cell type.
The appropriate sequence is selected, and hybridization at
high stringency conditions is performed to find a full
length corresponding gene. It is noted that the mouse
and human protein sequences are virtually identical.

III. Isolation of primate SOCS14 or SOCS15 clones
The full length, or appropriate fragments, of human
genes are used to isolate a corresponding monkey or other
primate gene. Preferably a full length coding sequence
is used for hybridization. Similar source materials as
indicated above are used to isolate natural genes,
including genetic, polymorphic, allelic, or strain
variants. Other species variants are also isolated using
similar methods.

IV. Isolation of an avian SOCS14 or SOCS15 clone
An appropriate avian source is selected as above.
Similar methods are utilized to isolate a species
variant, though the level of similarity will typically be
lower for avian protein as compared to a human to mouse
sequence.

V. Expression; purification; characterization
Proteins of interest are immunoprecipitated and
affinity purified as described above, e.g., from a
natural or recombinant source.

Alternatively, with an appropriate clone from above,
the coding sequence is inserted into an appropriate
expression vector. This may be in a vector specifically
selected for a prokaryote, yeast, insect, or higher
vertebrate, e.g., mammalian expression system. Standard
methods are applied to produce the gene product,
preferably as a soluble secreted molecule, though it will,
in certain instances, also be made as an intracellular
protein. Intracellular proteins typically require cell
lysis to recover the protein, and insoluble inclusion
bodies are a common starting material for further
purification.

With a clone encoding a vertebrate SOCS14 or SOCS15
protein, recombinant production means are used, although
natural forms may be purified from appropriate sources.
The protein product is purified by standard methods of
protein purification, in certain cases, e.g., coupled
with immunoaffinity methods. Immunoaffinity methods are
used either as a purification step, as described above,
or as a detection assay to determine the separation
properties of the protein.

Preferably, the protein is secreted into the medium,
and the soluble product is purified from the medium in a
soluble form. Alternatively, as described above,
inclusion bodies from prokaryotic expression systems are
a useful source of material. Typically, the insoluble
protein is solubilized from the inclusion bodies and
refolded using standard methods. Purification methods
are developed as described above.

The product of the purification method described
above is characterized to determine many structural
features. Standard physical methods are applied, e.g.,
amino acid analysis and protein sequencing. The
resulting protein is subjected to CD spectroscopy and
other spectroscopic methods, e.g., NMR, ESR, mass
spectroscopy, etc. The product is characterized to
determine its molecular form and size, e.g., using gel
chromatography and similar techniques. Understanding of
the chromatographic properties will lead to more gentle
or efficient purification methods.

Prediction of glycosylation sites may be made, e.g.,
as reported in Hansen, et al. (1995) Biochem. J. 308:801-
813. However, as intracellular proteins, they are
unlikely to be normally glycosylated.
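
As a rough illustration of sequence-based site screening, the sketch below scans a protein sequence for the classical N-linked consensus motif Asn-X-Ser/Thr (X not Pro). This motif scan is an assumption introduced here for illustration only; it does not reproduce the prediction method of the cited reference, and the function name and example sequence are hypothetical.

```python
import re


def find_n_glyc_sequons(seq):
    """Return 1-based positions of Asn residues in N-X-[S/T] motifs (X != P).

    A lookahead is used so that overlapping motifs are all reported.
    """
    seq = seq.upper()
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the sequence listing.
    example = "MANGSLNPTQNVT"
    print(find_n_glyc_sequons(example))
```

A hit from such a scan only marks a candidate site; whether it is used depends on cellular location, which is why an intracellular protein would not be expected to carry the modification.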

The purified protein is also used to identify
other binding partners of SOCS or WDS as described, e.g.,
in Fields and Song (1989) Nature 340:245-246.

VI. Preparation of antibodies against vertebrate SOCS or
WDS
With protein produced, as above, animals are
immunized to produce antibodies. Polyclonal antiserum is
raised using non-purified antigen, though the resulting
serum will exhibit higher background levels. Preferably,
the antigen is purified using standard protein
purification techniques, including, e.g., affinity
chromatography using polyclonal serum indicated above.
Presence of specific antibodies is detected using defined
synthetic peptide fragments.

Polyclonal serum is raised against a purified
antigen, purified as indicated above, or using, e.g., a
plurality of, synthetic peptides. A series of
overlapping synthetic peptides which encompass all of the
full length sequence, if presented to an animal, will
produce serum recognizing most linear epitopes on the
protein. Such an antiserum is used to affinity purify
protein, which is, in turn, used to introduce intact full
length protein into another animal to produce another
antiserum preparation.
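
The design of a series of overlapping peptides spanning a full length sequence can be sketched as follows. The peptide length (15) and step (5) are illustrative assumptions, not values given in this specification, and the function name is hypothetical; the example input is the first 32 residues of SEQ ID NO: 2.

```python
def overlapping_peptides(seq, length=15, step=5):
    """Tile `seq` with fixed-length peptides offset by `step` residues.

    The final peptide is anchored to the C-terminus so that every residue
    of the input is covered by at least one peptide.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if step > length:
        raise ValueError("step larger than length would leave gaps")
    if len(seq) <= length:
        return [seq]
    last_start = len(seq) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [seq[s:s + length] for s in starts]


if __name__ == "__main__":
    # Residues 1-32 of SEQ ID NO: 2 in one-letter code.
    fragment = "DLQSETTCQEQANSLKSSASHNGDLHLHLDEH"
    for pep in overlapping_peptides(fragment):
        print(pep)
```

A smaller step gives more peptides and more overlap between neighbors, which raises the chance that any given linear epitope lies wholly within at least one peptide.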
Similar techniques are used to generate
monoclonal antibodies to either unpurified antigen, or,
preferably, purified antigen.



VII. Cellular and tissue distribution
Distribution of the protein or gene products is
determined, e.g., using immunohistochemistry with an
antibody reagent, as produced above, by Western blotting
of cell lysates, or by screening for nucleic acids
encoding the respective protein. Either hybridization or
PCR methods are used to detect DNA, cDNA, or message
content. Histochemistry allows determination of the
specific cell types within a tissue which express higher
or lower levels of message or DNA. Antibody techniques
are useful to quantitate protein in a biological sample,
including a liquid or tissue sample. Immunoassays are
developed to quantitate protein. Also FACS analysis may
be used to evaluate expression in a cell population.
Appropriate tissue samples or cell types are isolated and
prepared for such detection. Commercial tissue blots are
available, e.g., from Clontech (Mountain View, CA).
Alternatively, cDNA library Southern blots can be
analyzed.

VIII. STAT interference by SOCS or WDS proteins
Standard methods for testing the biological activity
of the SOCS gene products in STAT signaling are
described, e.g., in Starr, et al. (1997) Nature 387:917-
921; Endo, et al. (1997) Nature 387:921-924; and Naka, et
al. (1997) Nature 387:924-929. Alternatively, JAK/STATs are
necessary for signal transduction. This assay is
performed as described, e.g., in Ho, et al. (1995) Mol.
Cell. Biol. 15:5043-5053, and blockage with these gene
products may be tested.

In particular, the STAT5 dependent signaling in
response to IL-2 is inhibited by the SOCS family member
SOCS3.






WO 99/03993 PCT/US98/14544


IX. Antagonists of SOCS function
The inhibition of SOCS function may be effected by
inhibitors of the specific interaction of these gene
products and their respective STAT molecules. With the
information on the specificity of pairings between these
SOCS and respective STAT family members, compound
libraries may be screened for blockage of such
interactions. Thus, inhibitory action of the SOCS may be
blocked with small molecule drug candidates.

Methods of using gene therapy are described, e.g.,
in Goodnow (1992) "Transgenic Animals" in Roitt (ed.)
Encyclopedia of Immunology, Academic Press, San Diego,
pp. 1502-1504; Travis (1992) Science 256:1392-1394; Kuhn,
et al. (1991) Science 254:707-710; Capecchi (1989)
Science 244:1288; Robertson (1987) (ed.) Teratocarcinomas
and Embryonic Stem Cells: A Practical Approach, IRL
Press, Oxford; and Rosenberg (1992) J. Clinical Oncology
10:180-199. Also included is the use of antisense RNA in
gene therapy to block expression of the target gene, or
proper splicing of gene transcripts.

X. Comparison of various SOCS embodiments
Tables 1 and 2 show comparison of various SOCS or
WDS embodiments. Table 1 shows comparisons of the
relevant portions of the gene products, particularly in
the region of SOCS14 from Met168 to Leu293.

Table 2 shows alignment of the WDS "SOCSBOX protein"
with a consensus of the mouse and human SOCS15 (WDS11)
protein sequences, which are identical. See GenBank
Accession numbers U88325; U88326; U88327; U88328;
AB000676; AB000677; AB000710. This is aligned with the
new WDS12, SEQ ID NO: 16.
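
The asterisk lines beneath the alignments mark columns where the aligned residues are identical. A minimal sketch of how such a marker line and a percent identity can be computed for two pre-aligned sequences is given below; the function names are hypothetical, the gap character "-" is an assumption, and the two example strings are taken from the last block of Table 2.

```python
from itertools import zip_longest


def identity_markers(a, b, gap="-"):
    """Return a string with '*' under each column where a and b agree.

    Both inputs are assumed to be already aligned; the shorter one is
    padded with the gap character, and gap columns never count as matches.
    """
    return "".join(
        "*" if x == y and x != gap else " "
        for x, y in zip_longest(a, b, fillvalue=gap)
    )


def percent_identity(a, b, gap="-"):
    """Identical columns as a percentage of all alignment columns."""
    marks = identity_markers(a, b, gap)
    return 100.0 * marks.count("*") / len(marks)


if __name__ == "__main__":
    wds12 = "EVQELPIPSKLLEFLSYRI"
    wds11 = "QVLALPIPKKMKEFLTYRTF"
    print(wds12)
    print(wds11)
    print(identity_markers(wds12, wds11))
    print(round(percent_identity(wds12, wds11), 1))
```

Dividing by the full alignment length is one of several possible conventions; dividing by the shorter sequence length would give a slightly higher figure.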
Table 1: Comparison of various SOCS family members. mCIS is SEQ
ID NO: 15; hSOCS1 is SEQ ID NO: 16; mSOCS1 is SEQ ID NO: 17; hSOCS2
is SEQ ID NO: 18; hSOCS3 is SEQ ID NO: 19; mSOCS3 is SEQ ID NO: 20;
and human SOCS16 is SEQ ID NO: 21.



mCIS
hSOCS1
mSOCS1
hSOCS2
hSOCS3
mSOCS3
hSOCS14  MEVRVKALVHSSS
mSOCS17  AELGEIR PES AQKKLPLRKA
hSOCS18  MDKVGKMWNNLKYRCQNLFSHEGGSRNENV^MNPNRCPSVKEKSISLGEA
hSOCS19  ERGLETNSCSEEELSSPGRGGGGGGRLLLQ



mCIS
hSOCS1
mSOCS1
hSOCS2   ALS PAATLTAWP ADS ARRGP
hSOCS3
mSOCS3
hSOCS14  PSPALNGVRKDFHDLQSETTCQEQANSLKSSASHNGDLHLHLDEHVPVVI
mSOCS17  EN TIFITLEIVKNLFKMAENNSKNVDVRPKTSRSRSAD-
hSOCS18  APQQESSPLRENVALQLGLSPSKTFSRRNQNCAAEIPQVVEISIEKDSDS
hSOCS19  PPGPELPPVPFPLQDLVPLGRLSRGEQQQQQQQQPPPPPPPPGPLRPLAG



mCIS
hSOCS1
mSOCS1
hSOCS2
hSOCS3
mSOCS3
hSOCS14  G LMPQDYIQYTVPLDEGMYPLEGSRS
mSOCS17  RKD GYVWSGKK-LSWSKKSESCSESEAKKG
hSOCS18  GATPGTRLARRDSYSRHAPWGGKKKHSCSTKTQSSLDTEKKFGRTRSGLQ
hSOCS19  PSRKGSFKIRLSRLFRTKSCNGGSGG



mCIS     MVLCVQG
hSOCS1
mSOCS1
hSOCS2   -GCTASGYPVPAARA-PAAGDQWVT--AAARDFVIR--PPGSGEKE
hSOCS3
mSOCS3
hSOCS14  YCLDSSSPMEVSAVPPQVGGRAFPEDESQVDQDLVVAPEIFVDQS
mSOCS17  QLSCSSIELDLDHSCG-HRFLGRSLK--QKLQDAVGQCFPIKNCSGR
hSOCS18  RRERRYGVSSMQDMDSVSS-RAVGSRSLR--QRLQDTVGLCFPMRTYSKQ
hSOCS19  GDGTGKRPSGELAAS-AASLTDMGG--SAGRELDAGRKPKLTRTQS
Table 1 (continued):

mCIS     SCPLLAVEQIGRR-PLWAQSLELPGPA MQPLPTGA
hSOCS1   MVAHNQVAADN AVSTAAEPR
mSOCS1   MVARNQVAADN AISPAAEPR--
hSOCS2   PHPFSLCHHFGHPAGLVLGFALTSRKD ANPSLTPARAAT
hSOCS3   MVTHSKFPAAG MSRPLDTSL
mSOCS3   MVTHSKFPAAG MSRPLDTSL
hSOCS14  VNGLLIGTTGVMLQSPRAGHDDVPPLS PLLPPMQNNQ
mSOCS17  HSPGLPSKRKIHISELMLDXCXFPPRSDLAFRWHFIKRHTVPMSPNS
hSOCS18  SKPLFSNKRKIHLSELMLEKCPFPAGSDLAQKWHLIKQHTAPVSPHSTFF
hSOCS19  AFSPVSFSPLFTGETVSLVDVDISQRG LTSPHPPTP



mCIS
hSOCS1   RRPE PSSSSSSS PAA
mSOCS1   RRSE PSSSSSSSS PAA
hSOCS2   CLCRGD PS LMTLR
hSOCS3   R
mSOCS3   R
hSOCS14  IQRNFS GLT
mSOCS17  DEWVSADLSERKLRDAQLKRRNTEDDIPCFSHTNGQPCVITANSAS
hSOCS18  DTFDPSLVSTEDEEDRLRERRRLSIEEGVDPPPNAQIHTFEATAQVNPLF
hSOCS19  PPPPRRSLSLLDDISGTLPTSVLVAPMGSSLQSFPLP



mCIS     -FPEEVTEETPVQAENE-- PKVLDP-
hSOCS1   PARPRPCPAVPAPAPGD-- -THFRTFRS-
mSOCS1   PVRPRPCPAVPAPAPGD-- -THFRTFRS-
hSOCS2   CLEPSGNGGEGTRSQWG-- -TAGSAEEP-
hSOCS3   LKTFSS-
mSOCS3   LKTFSS
hSOCS14  GTEAHVAE SMRCHLNFD PNSAPGVARVYDSVQ
mSOCS17  CTGGHITGSMMNLVTNN-SIEDSDMDSEDEIITLCTSSRKRNKPR--WEM
hSOCS18  KLGPKLAPGMTEISGDSSAIPQANCDSEEDTTTLCLQSR-RQKQRQISGD
hSOCS19  PPPPPHAPDAFPRIAPIR AAESLHSQPP



mCIS     EGDLLCIAKTFSYLRES GWYWGSITASEARQHLQ
hSOCS1   HADYRRITRASALLDAC GFYWGPLSVHGAHERLR
mSOCS1   HSDYRRITRTSALLDAC GFYWGPLSVHGAHERLR
hSOCS2   SPQAARLAKALRELGQT GWYWGSMTVNEAKEKLK
hSOCS3   KSEYQLVVNAVRKLQES GFYWSAVTGGEANLLLS
mSOCS3   KSEYQLVVNAVRKLQES GFYWSAVTGGEANLLLS
hSOCS14  SSGPMVVTSLTEELKKLAKQGWYWGPITRWEAEGKLA
mSOCS17  EEEILQLEAPPKFHTQIDYVHCLVPDLLQISNNPCYWGVMDKYAAEALLE
hSOCS18  SHTHVSRQGAWKVHTQIDYIHCLVPDLLQITGNPCYWGVMDRYEAEALSE
hSOCS19  QHLQCPLYRPDSSSFAASLRELEKC GWYWGPMNWEDAEMKLK
         * ** *
Table 1 (continued):

mCIS     KMPEGTFLVRDST-HPSYLFTLSVKTTRGPTNVRIEYADSSFRLDSNCLS
hSOCS1   AEPVGTFLVRDSR-QRNCFFALSVKMASGPTSIRVHFQAGRFHLDGS-R-
mSOCS1   AEPVGTFLVRDSR-QRNCFFALSVKMASGPTSIRVHFQAGRFHLDGS-R-
hSOCS2   EAPEGTFLIRDSS-HSDYLLTISVKTSAGPTNLRIEYQDGKFRLDSIICV
hSOCS3   AEPAGTFLIRDSSDQR-HFFALSVKTQSGTKNLRIQCEGGSFSLQSDPRS
mSOCS3   AEPAGTFLIRDSSDQR-HFFTLSVKTQSGTKNLRIQCEGGSFSLQSDPRS
hSOCS14  NVPDGSFLVRDSS-DDRYLLSLSFRSHGKTLHTRIEHSNGRFSFYEQPD-
mSOCS17  GKPEGTFLLRDSA-QEDYLFSVSFRRYSRSLHARIEQWNHNFSFDAHDP-
hSOCS18  GKPEGTFLLRDSA-QEDYLFSVSSAATTGSLHARIEQWNHNFSFDAHDP-
hSOCS19  GKPDGSFLVRDSS-DPRYILSLSFRSQGITHHTRMEHYRGTFSLWCHPKF
           * * ** ***          *          *       *



mCIS     RP-RILAFPDVVSLVQHYVASCAADTRSDSPDPAPTPALPMSKQDAPSDS
hSOCS1   ESFDCLFELLEHYVAAP RRMLG
mSOCS1   ETFDCLFELLEHYVAAP RRMLG
hSOCS2   KS-KLKQFDSVVHLIDYYVQMCKDK RTGPEAPRNG
hSOCS3   TQ-PVPRFDCVLKLVYHYMPPPGAPSFP-SPPTEPSSEVPEQPSAQPLPG
mSOCS3   TQ-PVPRFDCVLKLVHHYMPPPGTPSFS-LPPTEPSSEVPEQPPAQALPG
hSOCS14  VERTYSIVDLIEHSIQGLENG AFCYSRSRLPGSA
mSOCS17  CVFHSPDITGLLEHYKDPSA CMFFEPLLS
hSOCS18  CVFHSSTVTGLLEHYKDPSS CMFFEPLLT
hSOCS19  EDRCQSWEFIKRAIMHSKNGK FLYFLRSRVPGLP



mCIS     VLPIPVATAVHLKLVQPFVRRSS ARSLQHLCRLVTNRLVA DVD
hSOCS1   APLRQRR VRPLQELCRQRIVATVG-RENLA
mSOCS1   APLRQRR VRPLQELCRQRIVAAVG-RENLA
hSOCS2   TVHLYLTKPLYTSAPSLQHLCRLTINKCTG AIW
hSOCS3   SPPRRAYYIYSGGEKIPLVLSRPLSSNVATLQHLCRKTVNGHLDSYEKVT
mSOCS3   STPKRAYYIYSGGEKIPLVLSRPLSSNVATLQHLCRKTVNGHLDSYEKVT
hSOCS14  TYP VRLTNPVSRFMQVRSLQYLCRFVIRQYTR-IDLIQ
mSOCS17  TPLIRTFP FSLQHICRTVICNCTT-YDGID
hSOCS18  ISLNRTFP FSLQYICRAVICRCTT-YDGID
hSOCS19  PTP VQLLYPVSRFSNVKSLQHLCRFRIRQLVR-IDHIP
         * *



mCIS     CLPLPRRMADYLRQYPFQL
hSOCS1   RIPLNPVLRDYLSSFPFQI
mSOCS1   RIPLNPVLRDYLSSFPFQI
hSOCS2   GLPLPTRLKDYLEEYKFQV
hSOCS3   QLPG-P-IREFLDQYDAPL
mSOCS3   QLPG-P-IREFLDQYDAPL
hSOCS14  KLPLPNKMKDYLQEKHY
mSOCS17  ALPIPSPMKLYLKEYHYKSKVRLLRIDVPEQQ
hSOCS18  GLPLPSMLQDFLKEYHYKQKVRVRWLEREPVKAK
hSOCS19  DLPLPKPLISYIRKFYYYDPQEEVYLSLKEAQLISKQKQEVEPST
Table 2: Comparison of the WDS family members; WDS11 (SOCS15) and
WDS12.

WDS12           MLNIILIKFSSFSIRCAILSSVCLNEAITFAFLI^WLWNMDKYTMIRKL
WDS11 (SOCS15)  MLCSAAG EKSVFLWSMRSYTLIRKL
                **** * ** **** * *

WDS12           EGHHHDVVACDFSPDGALLATASYDTRVYIWDPHNGDILMEFGHLFPPPT
WDS11 (SOCS15)  EGHQSSVVSCDFSPDSALLVTASYDTSVIMWDPYTGERLRSLHHTQLEPT
                ***   ** ****** *** ****** *  ***  *  *    *    **

WDS12           PIFAGGANDRWVRSVSFSHDGLHVASLADDKMVRFWRIDEDYPVQVAPLS
WDS11 (SOCS15)  MDDSD-VHMSSLRSVCFSPEGLYLATVADDRLLRIWALELKAPVAFAPMT
                            *** **  **  *  ***   * *      **  **

WDS12           NGLCCAFSTIX3SVLAAGTHDGSVYFWATPRQVPSLQHLCRMSIRRVMPTQ
WDS11 (SOCS15)  NGLCCTFFPHG-GIATGTRDGHVQFWTAPRVLSSLKHLCRKALRSFLTTY
                ***** * * ** ** * ** ** * * * * * * *

WDS12           EVQELPIPSKLLEFLSYRI   219
WDS11 (SOCS15)  QVLALPIPKKMKEFLTYRTF  193
                 *  **** *  *** **


All references cited herein are incorporated herein
by reference to the same extent as if each individual
publication or patent application was specifically and
individually indicated to be incorporated by reference in
its entirety for all purposes.

Many modifications and variations of this invention
can be made without departing from its spirit and scope,
as will be apparent to those skilled in the art. The
specific embodiments described herein are offered by way
of example only, and the invention is to be limited only
by the terms of the appended claims, along with the full
scope of equivalents to which such claims are entitled.
WHAT IS CLAIMED IS:

1. An isolated or recombinant polypeptide comprising:
   a) at least 17 contiguous amino acids from the
      coding portion of SEQ ID NO: 2 or 6;
   b) at least 17 contiguous amino acids from the
      coding portion of SEQ ID NO: 4 or 8;
   c) at least 17 contiguous amino acids from the
      coding portion of SEQ ID NO: 10;
   d) at least 17 contiguous amino acids from the
      coding portion of SEQ ID NO: 12;
   e) at least 17 contiguous amino acids from the
      coding portion of SEQ ID NO: 14; or
   f) at least 17 contiguous amino acids from the
      coding portion of SEQ ID NO: 16.

2. The polypeptide of claim 1, comprising the amino
acid sequence of:
   a) a SOCS14 of SEQ ID NO: 2 or 6;
   b) a SOCS15 (WDS11) of SEQ ID NO: 4 or 8;
   c) a SOCS17 of SEQ ID NO: 10;
   d) a SOCS18 of SEQ ID NO: 12;
   e) a SOCS19 of SEQ ID NO: 14; or
   f) a WDS12 of SEQ ID NO: 16.

3. A fusion protein comprising the polypeptide of claim
1 or 2.

4. A binding compound which specifically binds to the
polypeptide of claim 1 or 2.

5. The binding compound of claim 4 which is an antibody
or antibody fragment.

6. A nucleic acid encoding the polypeptide of claim 1
or 2.

7. An expression vector comprising the nucleic acid of
claim 6.

8. A host cell comprising the vector of claim 7.

9. A process for recombinantly producing a polypeptide
comprising culturing the host cell of claim 8 under
conditions in which the polypeptide is expressed.

SEQUENCE LISTING

SEQ ID NO: 1 is primate SOCS14 nucleic acid sequence.
SEQ ID NO: 2 is primate SOCS14 amino acid sequence.
SEQ ID NO: 3 is rodent SOCS15 (WDS11) nucleic acid sequence.
SEQ ID NO: 4 is rodent SOCS15 (WDS11) amino acid sequence.
SEQ ID NO: 5 is primate SOCS14 nucleic acid sequence.
SEQ ID NO: 6 is primate SOCS14 nucleic acid sequence.
SEQ ID NO: 7 is primate SOCS15 (WDS11) amino acid sequence.
SEQ ID NO: 8 is primate SOCS15 (WDS11) nucleic acid sequence.
SEQ ID NO: 9 is rodent SOCS17 amino acid sequence.
SEQ ID NO: 10 is rodent SOCS17 nucleic acid sequence.
SEQ ID NO: 11 is primate SOCS18 amino acid sequence.
SEQ ID NO: 12 is primate SOCS18 nucleic acid sequence.
SEQ ID NO: 13 is primate SOCS19 nucleic acid sequence.
SEQ ID NO: 14 is primate SOCS19 amino acid sequence.
SEQ ID NO: 15 is primate WDS12 nucleic acid sequence.
SEQ ID NO: 16 is mouse WDS12 amino acid sequence.
SEQ ID NO: 17 is mouse CIS amino acid sequence.
SEQ ID NO: 18 is primate SOCS1 amino acid sequence.
SEQ ID NO: 19 is murine SOCS1 amino acid sequence.
SEQ ID NO: 20 is primate SOCS2 amino acid sequence.
SEQ ID NO: 21 is primate SOCS3 amino acid sequence.
SEQ ID NO: 22 is murine SOCS3 amino acid sequence.
SEQ ID NO: 23 is primate SOCS16 amino acid sequence.
SEQ ID NO: 24 is primate SOCS14 nucleotide sequence.
SEQ ID NO: 25 is primate SOCS15 (WDS11) nucleotide sequence.
SEQ ID NO: 26 is rodent SOCS17 nucleotide sequence.
SEQ ID NO: 27 is primate SOCS18 nucleotide sequence.
SEQ ID NO: 28 is primate SOCS19 nucleotide sequence.
SEQ ID NO: 29 is primate WDS12 nucleotide sequence.

(1) GENERAL INFORMATION:

     (i) APPLICANT:
          (A) NAME: Schering Corporation
          (B) STREET: 2000 Galloping Hill Road
          (C) CITY: Kenilworth
          (D) STATE: New Jersey
          (E) COUNTRY: USA
          (F) POSTAL CODE: 07033-0530

    (ii) TITLE OF INVENTION: Suppressors of Cytokine Signaling;
         Related Reagents

   (iii) NUMBER OF SEQUENCES: 29

    (iv) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Diskette
          (B) COMPUTER: Apple Macintosh
          (C) OPERATING SYSTEM: Macintosh 8.0.1
          (D) SOFTWARE: Microsoft Word 6.0

     (v) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: US
          (B) FILING DATE: 17-JUL-1998
          (C) CLASSIFICATION:

    (vi) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/055,804
          (B) FILING DATE: 15-AUG-1997

    (vi) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/055,853
          (B) FILING DATE: 15-AUG-1997

    (vi) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/053,153
          (B) FILING DATE: 18-JUL-1997

    (vi) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 60/053,244
          (B) FILING DATE: 18-JUL-1997


(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 930 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
          (A) NAME/KEY: unsure
          (B) LOCATION: 824
          (D) OTHER INFORMATION: /note= "position 824 is ambiguous;
               may be A, C, G, or T; all code for proline"

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 3..929


    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AC GAC CTC CAG TCT GAG ACC ACG TGC CAG GAG CAA GCC AAT TCA CTG  47
   Asp Leu Gln Ser Glu Thr Thr Cys Gln Glu Gln Ala Asn Ser Leu
   1               5                   10                  15

AAG AGC TCG GCT TCT CAT AAT GGA GAC CTG CAT CTT CAC CTG GAT GAA  95
Lys Ser Ser Ala Ser His Asn Gly Asp Leu His Leu His Leu Asp Glu
                20                  25                  30

CAT GTG CCT GTC GTT ATT GGA CTT ATG CCT CAG GAC TAC ATT CAG TAT  143
His Val Pro Val Val Ile Gly Leu Met Pro Gln Asp Tyr Ile Gln Tyr
            35                  40                  45

ACT GTG CCT TTA GAT GAG GGG ATG TAT CCT TTG GAA GGA TCA CGG AGC  191
Thr Val Pro Leu Asp Glu Gly Met Tyr Pro Leu Glu Gly Ser Arg Ser
        50                  55                  60

TAT TGT CTG GAC AGC TCT TCT CCC ATG GAA GTC TCT GCG GTT CCT CCT  239
Tyr Cys Leu Asp Ser Ser Ser Pro Met Glu Val Ser Ala Val Pro Pro
    65                  70                  75

CAA GTG GGA GGG CGC GCT TTC CCC GAG GAT GAG AGT CAG GTA GAC CAG  287
Gln Val Gly Gly Arg Ala Phe Pro Glu Asp Glu Ser Gln Val Asp Gln
80                  85                  90                  95

GAC CTA GTT GTC GCC CCA GAG ATC TTC GTG GAT CAG TCC GGT GAA TGG  335
Asp Leu Val Val Ala Pro Glu Ile Phe Val Asp Gln Ser Gly Glu Trp
                100                 105                 110

CTT GTT GAT TGG CAC CAC GGG AGT CAT GTT GCA GAA CCC CGG AGA GCG  383
Leu Val Asp Trp His His Gly Ser His Val Ala Glu Pro Arg Arg Ala
            115                 120                 125

GGT TCA CGA TGG ATG TCC CTC CAA TCT TCA CCA TTG GTT ACC TCC AAT  431
Gly Ser Arg Trp Met Ser Leu Gln Ser Ser Pro Leu Val Thr Ser Asn
        130                 135                 140

GCA GGA ATA ATC CAA ATC CCA AAG GGG ACC TTC AGT GGA CTC ACT GGG  479
Ala Gly Ile Ile Gln Ile Pro Lys Gly Thr Phe Ser Gly Leu Thr Gly
    145                 150                 155

ACA GAA GCC CAC GTG GCT GAA AGT ATG CGC TGT CAT TTG AAT TTT GAT  527
Thr Glu Ala His Val Ala Glu Ser Met Arg Cys His Leu Asn Phe Asp
160                 165                 170                 175

CCG AAC TCT GCT CCT GGG GTT GCA AGA GTT TAT GAC TCA GTG CAA AGT  575
Pro Asn Ser Ala Pro Gly Val Ala Arg Val Tyr Asp Ser Val Gln Ser
                180                 185                 190

AGT GGT CCC ATG GTT GTG ACA AGC CTT ACA GAG GAG CTG AAA AAA CTT  623
Ser Gly Pro Met Val Val Thr Ser Leu Thr Glu Glu Leu Lys Lys Leu
            195                 200                 205

GCA AAG CAA GGA TGG TAC TGG GGA CCA ATC ACA CGT TGG GAG GCA GAA  671
Ala Lys Gln Gly Trp Tyr Trp Gly Pro Ile Thr Arg Trp Glu Ala Glu
        210                 215                 220

GGG AAG CTA GCA AAC GTG CCA GAT GGT TCT TTT CTT GTT CGG GAC AGT  719
Gly Lys Leu Ala Asn Val Pro Asp Gly Ser Phe Leu Val Arg Asp Ser
    225                 230                 235

TCT GAC GAC CGT TAC CTT TTA AGC TTG AGC TTT CGC TCC CAT GGT AAA  767
Ser Asp Asp Arg Tyr Leu Leu Ser Leu Ser Phe Arg Ser His Gly Lys
240                 245                 250                 255

ACA CTT CAC ACT AGA ATT GAG CAC TCA AAT GGT AGG TTT AGC TTT TAT  815
Thr Leu His Thr Arg Ile Glu His Ser Asn Gly Arg Phe Ser Phe Tyr
                260                 265                 270

GAA CAG CCC GAT GTG GAA GGA CAT ACG TCC ATA GTT GAT CTA ATT GGA  863
Glu Gln Pro Asp Val Glu Gly His Thr Ser Ile Val Asp Leu Ile Gly
            275                 280                 285

GCA TTC AAT CAG GGA CTC TGA AAA TGG GAG CTT TTT GTT ATT CAA GGT  911
Ala Phe Asn Gln Gly Leu *   Lys Trp Glu Leu Phe Val Ile Gln Gly
        290                 295                 300

CTC GGC TGC CTG GAA TCT G  930
Leu Gly Cys Leu Glu Ser
    305


60 (2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid 
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(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

5 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Asp Leu Gin Ser Glu Thr Thr Cys Gin Glu Gin Ala Asn Ser Leu Lys 
15 10 15 

10 Ser Ser Ala Ser His Asn Gly Asp Leu His Leu His Leu Asp Glu His 

20 25 30 






Val Pro Val Val Ile Gly Leu Met Pro Gln Asp Tyr Ile Gln Tyr Thr 
35 40 45 

Val Pro Leu Asp Glu Gly Met Tyr Pro Leu Glu Gly Ser Arg Ser Tyr 
50 55 60 



Cys Leu Asp Ser Ser Ser Pro Met Glu Val Ser Ala Val Pro Pro Gln 
65 70 75 80 

Val Gly Gly Arg Ala Phe Pro Glu Asp Glu Ser Gln Val Asp Gln Asp 

85 90 95 

Leu Val Val Ala Pro Glu Ile Phe Val Asp Gln Ser Gly Glu Trp Leu 

100 105 110 



Val Asp Trp His His Gly Ser His Val Ala Glu Pro Arg Arg Ala Gly 
115 120 125 

Ser Arg Trp Met Ser Leu Gln Ser Ser Pro Leu Val Thr Ser Asn Ala 
130 135 140 



Gly Ile Ile Gln Ile Pro Lys Gly Thr Phe Ser Gly Leu Thr Gly Thr 
145 150 155 160 

Glu Ala His Val Ala Glu Ser Met Arg Cys His Leu Asn Phe Asp Pro 

165 170 175 

Asn Ser Ala Pro Gly Val Ala Arg Val Tyr Asp Ser Val Gln Ser Ser 

180 185 190 






Gly Pro Met Val Val Thr Ser Leu Thr Glu Glu Leu Lys Lys Leu Ala 
195 200 205 

Lys Gln Gly Trp Tyr Trp Gly Pro Ile Thr Arg Trp Glu Ala Glu Gly 
210 215 220 



Lys Leu Ala Asn Val Pro Asp Gly Ser Phe Leu Val Arg Asp Ser Ser 
225 230 235 240 

Asp Asp Arg Tyr Leu Leu Ser Leu Ser Phe Arg Ser His Gly Lys Thr 

245 250 255 

Leu His Thr Arg Ile Glu His Ser Asn Gly Arg Phe Ser Phe Tyr Glu 

260 265 270 






Gln Pro Asp Val Glu Gly His Thr Ser Ile Val Asp Leu Ile Gly Ala 
275 280 285 

Phe Asn Gln Gly Leu * Lys Trp Glu Leu Phe Val Ile Gln Gly Leu 
290 295 300 

Gly Cys Leu Glu Ser 



305 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 476 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3..476 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CA GCT TCG TAT GAC ACC AGT GTG ATT ATG TGG GAC CCC TAC ACC GGC 
Ala Ser Tyr Asp Thr Ser Val Ile Met Trp Asp Pro Tyr Thr Gly 
1 5 10 15 

GAG AGG CTG AGG TCA CTT CAT CAC ACA CAG CTT GAA CCC ACC ATG GAT 
Glu Arg Leu Arg Ser Leu His His Thr Gln Leu Glu Pro Thr Met Asp 

20 25 30 

GAC AGT GAC GTC CAC ATG AGC TCC CTG AGG TCC GTG TGC TTC TCA CCT 
Asp Ser Asp Val His Met Ser Ser Leu Arg Ser Val Cys Phe Ser Pro 

35 40 45 

GAA GGC TTG TAT CTC GCT ACG GTG GCA GAT GAC AGG CTG CTC AGG ATC 
Glu Gly Leu Tyr Leu Ala Thr Val Ala Asp Asp Arg Leu Leu Arg Ile 
50 55 60 

TGG GCT CTG GAA CTG AAG GCT CCG GTT GCC TTT GCT CCG ATG ACC AAT 
Trp Ala Leu Glu Leu Lys Ala Pro Val Ala Phe Ala Pro Met Thr Asn 
65 70 75 

GGT CTT TGC TGC ACG TTC TTC CCA CAC GGT GGA ATT ATT GCC ACA GGG 
Gly Leu Cys Cys Thr Phe Phe Pro His Gly Gly Ile Ile Ala Thr Gly 
80 85 90 95 

ACG AGA GAT GGC CAT GTC CAG TTC TGG ACA GCT CCC CGG GTC CTG TCC 
Thr Arg Asp Gly His Val Gln Phe Trp Thr Ala Pro Arg Val Leu Ser 

100 105 110 

TCA CTG AAG CAC TTA TGC AGG AAA GCC CTC CGA AGT TTC CTG ACA ACG 
Ser Leu Lys His Leu Cys Arg Lys Ala Leu Arg Ser Phe Leu Thr Thr 

115 120 125 

TAT CAA GTC CTA GCA CTG CCA ATC CCC AAG AAG ATG AAA GAG TTC CTC 
Tyr Gln Val Leu Ala Leu Pro Ile Pro Lys Lys Met Lys Glu Phe Leu 
130 135 140 

ACA TAC AGG ACT TTC TAG CAG TGC CGG CTC CCC CAC CTC CTG CAG 
Thr Tyr Arg Thr Phe * Gln Cys Arg Leu Pro His Leu Leu Gln 
145 150 155 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 158 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Ala Ser Tyr Asp Thr Ser Val Ile Met Trp Asp Pro Tyr Thr Gly Glu 
1 5 10 15 

Arg Leu Arg Ser Leu His His Thr Gln Leu Glu Pro Thr Met Asp Asp 

20 25 30 

Ser Asp Val His Met Ser Ser Leu Arg Ser Val Cys Phe Ser Pro Glu 

35 40 45 






Gly Leu Tyr Leu Ala Thr Val Ala Asp Asp Arg Leu Leu Arg Ile Trp 
50 55 60 

Ala Leu Glu Leu Lys Ala Pro Val Ala Phe Ala Pro Met Thr Asn Gly 
65 70 75 80 



Leu Cys Cys Thr Phe Phe Pro His Gly Gly Ile Ile Ala Thr Gly Thr 
85 90 95 

Arg Asp Gly His Val Gln Phe Trp Thr Ala Pro Arg Val Leu Ser Ser 

100 105 110 

Leu Lys His Leu Cys Arg Lys Ala Leu Arg Ser Phe Leu Thr Thr Tyr 

115 120 125 



Gln Val Leu Ala Leu Pro Ile Pro Lys Lys Met Lys Glu Phe Leu Thr 
130 135 140 

Tyr Arg Thr Phe * Gln Cys Arg Leu Pro His Leu Leu Gln 
145 150 155 

(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 87..1241 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 

(B) LOCATION: 20 

(D) OTHER INFORMATION: /note= "nucleotide may be A or C at 
positions: 20, 36, 1583, 1675, 1689, 1693, 1710, 1711, 1719, 
1720, 1728, 1753, 1787, and 1806." 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 35 

(D) OTHER INFORMATION: /note= "nucleotide may be G or T at 
positions: 35, 1541, 1594, 1689, 1778, 1779, 1825, 1844, 1845, 
1853, 1854, 1865, 1884, and 1893." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 70 

(D) OTHER INFORMATION: /note= "Nucleotide may be A or G at 
positions: 70, 1461, 1630, 1677, 1713, 1725, 1734, 1735, 1757, 
1805, 1810, and 1863." 



(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 64 

(D) OTHER INFORMATION: /note= "Nucleotide may be A or T at 
positions: 64, 1692, 1715, 1718, 1721, 1722, 1799, 1837, 1841, 

1876, and 1894." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 1661 

(D) OTHER INFORMATION: /note= "Nucleotide may be C or T at 
positions: 1661, 1729, 1749, 1750, 1754, 1776, 1802, 1826, 1847, 
1859, 1860, 1904, 1907, and 1911." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1731 

(D) OTHER INFORMATION: /note= "Nucleotide may be G or C at 
positions: 1731, 1817, 1887, and 1908." 


(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1869 

(D) OTHER INFORMATION: /note= "Nucleotide may be C, G, or 
T at positions: 1869, 1883, 1885, 1886, and 1895." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1888 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, or 

G at positions: 1888, and 1896." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 1877 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, or 
T at positions: 1877, and 1898." 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 

(B) LOCATION: 1855 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, G, or 
T at position 1855." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1935 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, G, 
or T at positions: 1935, and 2034." 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



TAAGGTCCAC GTCGCTCCGC AGCCATCACT ACAGGCCCGC GCCGTGGCCT CTGCGGCCCA 
CAATCTCCGA GGAGACCTGC ATCAAG ATG GAG GTG AGA GTC AAG GCC TTG GTT 113 

Met Glu Val Arg Val Lys Ala Leu Val 
1 5 


CAC TCT TCC AGC CCG AGT CCA GCC CTG AAT GGC GTC CGG AAG GAT TTC 161 
His Ser Ser Ser Pro Ser Pro Ala Leu Asn Gly Val Arg Lys Asp Phe 
10 15 20 25 

CAC GAC CTC CAG TCT GAG ACC ACG TGC CAG GAG CAA GCC AAT TCA CTG 209 
His Asp Leu Gln Ser Glu Thr Thr Cys Gln Glu Gln Ala Asn Ser Leu 

30 35 40 

AAG AGC TCG GCT TCT CAT AAT GGA GAC CTG CAT CTT CAC CTG GAT GAA 257 
Lys Ser Ser Ala Ser His Asn Gly Asp Leu His Leu His Leu Asp Glu 

45 50 55 

CAT GTG CCT GTC GTT ATT GGA CTT ATG CCT CAG GAC TAC ATT CAG TAT 305 
His Val Pro Val Val Ile Gly Leu Met Pro Gln Asp Tyr Ile Gln Tyr 
60 65 70 

ACT GTG CCT TTA GAT GAG GGG ATG TAT CCT TTG GAA GGA TCA CGG AGC 353 
Thr Val Pro Leu Asp Glu Gly Met Tyr Pro Leu Glu Gly Ser Arg Ser 
75 80 85 


TAT TGT CTG GAC AGC TCT TCT CCC ATG GAA GTC TCT GCG GTT CCT CCT 401 
Tyr Cys Leu Asp Ser Ser Ser Pro Met Glu Val Ser Ala Val Pro Pro 
90 95 100 105 

CAA GTG GGA GGG CGC GCT TTC CCC GAG GAT GAG AGT CAG GTA GAC CAG 449 
Gln Val Gly Gly Arg Ala Phe Pro Glu Asp Glu Ser Gln Val Asp Gln 

110 115 120 

GAC CTA GTT GTC GCC CCA GAG ATC TTC GTG GAT CAG TCC GTG AAT GGC 497 
Asp Leu Val Val Ala Pro Glu Ile Phe Val Asp Gln Ser Val Asn Gly 

125 130 135 

TTG TTG ATT GGC ACC ACG GGA GTC ATG TTG CAG AGC CCG AGA GCG GGT 545 
Leu Leu Ile Gly Thr Thr Gly Val Met Leu Gln Ser Pro Arg Ala Gly 
140 145 150 

CAC GAT GAT GTC CCT CCA CTC TCA CCA TTG CTA CCT CCA ATG CAG AAT 593 

His Asp Asp Val Pro Pro Leu Ser Pro Leu Leu Pro Pro Met Gln Asn 

155 160 165 


AAT CAA ATC CAA AGG AAC TTC AGT GGA CTC ACT GGC ACA GAA GCC CAC 641 

Asn Gln Ile Gln Arg Asn Phe Ser Gly Leu Thr Gly Thr Glu Ala His 
170 175 180 185 

GTG GCT GAA AGT ATG CGC TGT CAT TTG AAT TTT GAT CCG AAC TCT GCT 689 
Val Ala Glu Ser Met Arg Cys His Leu Asn Phe Asp Pro Asn Ser Ala 

190 195 200 

CCT GGG GTT GGA AGA GTT TAT GAC TCA GTG CAA AGT AGT GGT CCC ATG 737 
Pro Gly Val Ala Arg Val Tyr Asp Ser Val Gln Ser Ser Gly Pro Met 

205 210 215 

GTT GTG ACA AGC CTT ACA GAG GAG CTG AAA AAA CTT GCA AAG CAA GGA 785 
Val Val Thr Ser Leu Thr Glu Glu Leu Lys Lys Leu Ala Lys Gln Gly 
220 225 230 

TGG TAC TGG GGA CCA ATC ACA CGT TGG GAG GCA GAA GGG AAG CTA GCA 833 
Trp Tyr Trp Gly Pro Ile Thr Arg Trp Glu Ala Glu Gly Lys Leu Ala 
235 240 245 
AAC GTG CCA GAT GGT TCT TTT CTT GTT CGG GAC AGT TCT GAC GAC CGT 881 

Asn Val Pro Asp Gly Ser Phe Leu Val Arg Asp Ser Ser Asp Asp Arg 

250 255 260 265 


TAC CTT TTA AGC TTG AGC TTT CGC TCC CAT GGT AAA ACA CTT CAC ACT 929 

Tyr Leu Leu Ser Leu Ser Phe Arg Ser His Gly Lys Thr Leu His Thr 

270 275 280 

AGA ATT GAG CAC TCA AAT GGT AGG TTT AGC TTT TAT GAA CAG CCA GAT 977 
Arg Ile Glu His Ser Asn Gly Arg Phe Ser Phe Tyr Glu Gln Pro Asp 

285 290 295 

GTG GAA AGG ACA TAC TCC ATA GTT GAT CTA ATT GAG CAT TCC ATC CAG 1025 
Val Glu Arg Thr Tyr Ser Ile Val Asp Leu Ile Glu His Ser Ile Gln 

300 305 310 

GGA CTC GAA AAT GGA GCT TTT TGT TAT TCA AGG TCT CGG CTG CCT GGA 1073 
Gly Leu Glu Asn Gly Ala Phe Cys Tyr Ser Arg Ser Arg Leu Pro Gly 
315 320 325 

TCT GCA ACT TAC CCC GTC AGA CTG ACC AAC CCA GTG TCC CGG TTC ATG 1121 
Ser Ala Thr Tyr Pro Val Arg Leu Thr Asn Pro Val Ser Arg Phe Met 
330 335 340 345 


CAG GTG CGC TCG TTG CAG TAC CTG TGT CGT TTT GTT ATA CGT CAG TAT 1169 
Gln Val Arg Ser Leu Gln Tyr Leu Cys Arg Phe Val Ile Arg Gln Tyr 

350 355 360 

ACC AGA ATA GAC TTA ATT CAG AAA CTG CCT TTG CCA AAC AAA ATG AAG 1217 
Thr Arg Ile Asp Leu Ile Gln Lys Leu Pro Leu Pro Asn Lys Met Lys 

365 370 375 

GAT TAT TTA CAG GAG AAG CAC TAC TGAAAGATTG AGAACCCTGC ATCTTGCACT 1271 
Asp Tyr Leu Gln Glu Lys His Tyr 

380 385 

TTGGGAATAA GAACAAGAGA TTGAAATACA GTTTACAAAC TTTCATTGCC ATCAAAATCT 1331 

TTTGCTGCCA TAACTATTTC AGTTTTATGT GTAAAAGAGT CATCAGTTTG TTTAGGGGTG 1391 

GGGAAGTGTC AGCAAGGTGT CTTGGGTTTA TTTTGGTTCT TTAAAAAAGG GAAGTCTTGA 1451 

AGTTTTAGAA GTGTTGAATT ATGTTTCATC AATGTGCAGA ATAATCACAA TGTGAATTAT 1511 


CAAATTCTCC TCAATGCCCC CCCCGCCCAT TCCTTTGCTG CTATCCACTG TGATTTTTAT 1571 

GCATTAAAAG CCCATTTCAT GTTTTTTCAA CCCTAAGTAA AGTTGAATGA AACTTAACAG 1631 

AATGGAAATT GCTATTTCTT TTTAAATGGC CCATTTTCCA AAACAAGTGT TGAATAACCA 1691 

ACCCTGTTTG AATAAAACCC GAAATTACCA ATAACACCGG AGGTGAGTTT TTAATCTCCT 1751 

ACCTTGAAAA GATTTATTTA GAATCGGGAA TTGACCTAAT ATTGGGTAAT TGGACCGGAG 1811 


ATCTGCAACA TATTCTTTAA CAACAATTTA TTGGCCTTAA TTTGTTTCCA AAGGTGGCCT 1871 

TATTTCTTTG GGGGGGGAAA GGAGGAATTC TCCGTCCCCC TCGTTTTCAT CTTCTAGTTT 1931 

GTGCTATTTT AATAAATGGC CTTACATTAA AAAATTGTAA AGAAATGTAT ACCACCAATT 1991 

TAGAAATTGT TGCCTTTTCT GTAATTAAAC TCGGGTACAA ATCGGCATAA CATGAAAACC 2051 

TATGGAACTA GAATTATTAT TAAAGAAATA TTAGATGATC AT 2093 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 385 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Glu Val Arg Val Lys Ala Leu Val His Ser Ser Ser Pro Ser Pro 
1 5 10 15 

Ala Leu Asn Gly Val Arg Lys Asp Phe His Asp Leu Gln Ser Glu Thr 

20 25 30 

Thr Cys Gln Glu Gln Ala Asn Ser Leu Lys Ser Ser Ala Ser His Asn 
35 40 45 

Gly Asp Leu His Leu His Leu Asp Glu His Val Pro Val Val Ile Gly 
50 55 60 

Leu Met Pro Gln Asp Tyr Ile Gln Tyr Thr Val Pro Leu Asp Glu Gly 
65 70 75 80 

Met Tyr Pro Leu Glu Gly Ser Arg Ser Tyr Cys Leu Asp Ser Ser Ser 

85 90 95 

Pro Met Glu Val Ser Ala Val Pro Pro Gln Val Gly Gly Arg Ala Phe 

100 105 110 

Pro Glu Asp Glu Ser Gln Val Asp Gln Asp Leu Val Val Ala Pro Glu 
115 120 125 

Ile Phe Val Asp Gln Ser Val Asn Gly Leu Leu Ile Gly Thr Thr Gly 
130 135 140 

Val Met Leu Gln Ser Pro Arg Ala Gly His Asp Asp Val Pro Pro Leu 
145 150 155 160 

Ser Pro Leu Leu Pro Pro Met Gln Asn Asn Gln Ile Gln Arg Asn Phe 

165 170 175 

Ser Gly Leu Thr Gly Thr Glu Ala His Val Ala Glu Ser Met Arg Cys 

180 185 190 

His Leu Asn Phe Asp Pro Asn Ser Ala Pro Gly Val Ala Arg Val Tyr 
195 200 205 

Asp Ser Val Gln Ser Ser Gly Pro Met Val Val Thr Ser Leu Thr Glu 
210 215 220 

Glu Leu Lys Lys Leu Ala Lys Gln Gly Trp Tyr Trp Gly Pro Ile Thr 
225 230 235 240 

Arg Trp Glu Ala Glu Gly Lys Leu Ala Asn Val Pro Asp Gly Ser Phe 

245 250 255 

Leu Val Arg Asp Ser Ser Asp Asp Arg Tyr Leu Leu Ser Leu Ser Phe 

260 265 270 
Arg Ser His Gly Lys Thr Leu His Thr Arg Ile Glu His Ser Asn Gly 
275 280 285 

Arg Phe Ser Phe Tyr Glu Gln Pro Asp Val Glu Arg Thr Tyr Ser Ile 
290 295 300 

Val Asp Leu Ile Glu His Ser Ile Gln Gly Leu Glu Asn Gly Ala Phe 
305 310 315 320 

Cys Tyr Ser Arg Ser Arg Leu Pro Gly Ser Ala Thr Tyr Pro Val Arg 

325 330 335 






Leu Thr Asn Pro Val Ser Arg Phe Met Gln Val Arg Ser Leu Gln Tyr 

340 345 350 

Leu Cys Arg Phe Val Ile Arg Gln Tyr Thr Arg Ile Asp Leu Ile Gln 
355 360 365 



Lys Leu Pro Leu Pro Asn Lys Met Lys Asp Tyr Leu Gln Glu Lys His 
370 375 380 

Tyr 
385 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1748 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1335 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1026 

(D) OTHER INFORMATION: /note= "Nucleotide may be C or T at 
positions: 1026, 1032, 1041, 1452, 1510, and 1567." 


(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 945 

(D) OTHER INFORMATION: /note= "Nucleotide may be A or G at 
positions: 945, 1376, 1541, 1658, 1662, and 1668." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1435 

(D) OTHER INFORMATION: /note= "Nucleotide may be G or T at 

positions: 1435, 1481, 1518, and 1543." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 1500 

(D) OTHER INFORMATION: /note= "Nucleotide may be A or C at 
positions: 1500, and 1669." 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 

(B) LOCATION: 1521 

(D) OTHER INFORMATION: /note= "Nucleotide may be A or T at 
positions: 1521, and 1542." 


(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1651 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, or 
T at position 1651." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1654 

(D) OTHER INFORMATION: /note= "Nucleotide may be G, T, or 

C at position 1654." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 1656 

(D) OTHER INFORMATION: /note= "Nucleotide may be G, C, or 
A at position 1656." 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 

(B) LOCATION: 1589..1649 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, T, 
or G at positions: 1589-1649, 1652, 1655, 1657-1661, 1664-1667, 
and 1672-1748." 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



ATG GAG GCC GGA GAG GAA CCG CTG CTG CTG GCC GAA CTC AAG CCC GGG 48 
Met Glu Ala Gly Glu Glu Pro Leu Leu Leu Ala Glu Leu Lys Pro Gly 
1 5 10 15 

CGC CCC CAC CAG TTT GAT TGG AAG TCC AGC TGT GAA ACC TGG AGC GTG 96 
Arg Pro His Gln Phe Asp Trp Lys Ser Ser Cys Glu Thr Trp Ser Val 
20 25 30 

GCC TTC TCG CCA GAC GGT TCC TGG TTC GCC TGG TCT CAA GGA CAC TGC 144 
Ala Phe Ser Pro Asp Gly Ser Trp Phe Ala Trp Ser Gln Gly His Cys 
35 40 45 


GTG GTC AAG CTG GTC CCC TGG CCC TTA GAG GAA CAG TTC ATC CCT AAA 192 
Val Val Lys Leu Val Pro Trp Pro Leu Glu Glu Gln Phe Ile Pro Lys 
50 55 60 

GGA TTC GAA GCC AAG AGC CGA AGC AGC AAG AAT GAC CCA AAA GGA CGG 240 
Gly Phe Glu Ala Lys Ser Arg Ser Ser Lys Asn Asp Pro Lys Gly Arg 
65 70 75 80 

GGC AGT CTG AAG GAG AAG ACG CTG GAC TGT GGC CAG ATT GTG TGG GGG 288 
Gly Ser Leu Lys Glu Lys Thr Leu Asp Cys Gly Gln Ile Val Trp Gly 

85 90 95 

CTG GCC TTC AGC CCA TGG CCC TCT CCA CCC AGC AGG AAA CTC TGG GCA 336 
Leu Ala Phe Ser Pro Trp Pro Ser Pro Pro Ser Arg Lys Leu Trp Ala 
100 105 110 

CGT CAC CAT CCC CAG GCG CCT GAT GTT TCT TGC CTG ATC CTG GCC ACA 384 
Arg His His Pro Gln Ala Pro Asp Val Ser Cys Leu Ile Leu Ala Thr 
115 120 125 
GGT CTC AAC GAT GGG CAG ATC AAG ATT TGG GAG GTA CAG ACA GGC CTC 432 
Gly Leu Asn Asp Gly Gln Ile Lys Ile Trp Glu Val Gln Thr Gly Leu 
130 135 140 


CTG CTT CTG AAT CTT TCT GGC CAC CAA GAC GTC GTG AGA GAT CTG AGC 480 
Leu Leu Leu Asn Leu Ser Gly His Gln Asp Val Val Arg Asp Leu Ser 
145 150 155 160 

TTC ACG CCC AGC GGC AGT TTG ATT TTG GTC TCT GCA TCC CGG GAT AAG 528 
Phe Thr Pro Ser Gly Ser Leu Ile Leu Val Ser Ala Ser Arg Asp Lys 

165 170 175 

ACA CTT CGA ATT TGG GAC CTG AAT AAG CAC GGT AAG CAG ATC CAG GTG 576 
Thr Leu Arg Ile Trp Asp Leu Asn Lys His Gly Lys Gln Ile Gln Val 

180 185 190 

TTA TCC GGC CAT CTG CAG TGG GTT TAC TGC TGC TCC ATC TCC CCT GAC 624 
Leu Ser Gly His Leu Gln Trp Val Tyr Cys Cys Ser Ile Ser Pro Asp 
195 200 205 

TGT AGC ATG CTG TGC TCT GCA GCT GGG GAG AAG TCG GTC TTT CTG TGG 672 
Cys Ser Met Leu Cys Ser Ala Ala Gly Glu Lys Ser Val Phe Leu Trp 
210 215 220 


AGC ATG CGG TCC TAC ACA CTA ATC CGG AAA CTA GAA GGC CAC CAA AGC 720 
Ser Met Arg Ser Tyr Thr Leu Ile Arg Lys Leu Glu Gly His Gln Ser 
225 230 235 240 

AGT GTT GTC TCC TGT GAT TTC TCT CCT GAT TCA GCC TTG CTT GTC ACA 768 
Ser Val Val Ser Cys Asp Phe Ser Pro Asp Ser Ala Leu Leu Val Thr 

245 250 255 

GCT TCG TAT GAC ACC AGT GTG ATT ATG TGG GAC CCC TAC ACC GGC GAG 816 
Ala Ser Tyr Asp Thr Ser Val Ile Met Trp Asp Pro Tyr Thr Gly Glu 

260 265 270 

AGG CTG AGG TCA CTT CAT CAC ACA CAG CTT GAA CCC ACC ATG GAT GAC 864 
Arg Leu Arg Ser Leu His His Thr Gln Leu Glu Pro Thr Met Asp Asp 
275 280 285 

AGT GAC GTC CAC ATG AGC TCC CTG AGG TCC GTG TGC TTC TCA CCT GAA 912 

Ser Asp Val His Met Ser Ser Leu Arg Ser Val Cys Phe Ser Pro Glu 

290 295 300 


GGC TTG TAT CTC GCT ACG GTG GCA GAT GAC AGA CTG CTC AGG ATC TGG 960 

Gly Leu Tyr Leu Ala Thr Val Ala Asp Asp Arg Leu Leu Arg Ile Trp 

305 310 315 320 

GCT CTG GAA CTG AAA GCT CCG GTT GCC TTT GCT CCG ATG ACC AAT GGT 1008 
Ala Leu Glu Leu Lys Ala Pro Val Ala Phe Ala Pro Met Thr Asn Gly 

325 330 335 

CTT TGC TGC ACA TTT TTC CCA CAC GGT GGA ATC ATT GCC ACA GGG ACA 1056 
Leu Cys Cys Thr Phe Phe Pro His Gly Gly Ile Ile Ala Thr Gly Thr 

340 345 350 

AGA GAT GGC CAC GTC CAG TTC TGG ACA GCT CCT AGG GTC CTG TCC TCA 1104 
Arg Asp Gly His Val Gln Phe Trp Thr Ala Pro Arg Val Leu Ser Ser 
355 360 365 

CTG AAG CAC TTA TGC CGG AAA GCC CTT CGA AGT TTC CTA ACA ACT TAC 1152 
Leu Lys His Leu Cys Arg Lys Ala Leu Arg Ser Phe Leu Thr Thr Tyr 
370 375 380 
CAA GTC CTA GCA CTG CCA ATC CCC AAG AAA ATG AAA GAG TTC CTC ACA 1200 

Gln Val Leu Ala Leu Pro Ile Pro Lys Lys Met Lys Glu Phe Leu Thr 
385 390 395 400 


TAC AGG ACT TTT TAA GCA ACA CCA CAT CTT GTG CTT CTT TGT AGC AGG 1248 

Tyr Arg Thr Phe * Ala Thr Pro His Leu Val Leu Leu Cys Ser Arg 

405 410 415 

GTA AAT CGT CCT GTC AAA GGG AGT TGC TGG AAT AAT GGG CCA AAC ATC 1296 
Val Asn Arg Pro Val Lys Gly Ser Cys Trp Asn Asn Gly Pro Asn Ile 

420 425 430 

TGG TCT TGC ATT GAA ATA GCA TTT CTT TGG GAT TGT GAA TAGAATGTAG 1345 
Trp Ser Cys Ile Glu Ile Ala Phe Leu Trp Asp Cys Glu 

435 440 445 

CAAAACCAGA TTCCAGTGTA CTAGTCATGG GTCTTTCTCT CCCTGGGCAT GTGGAAAGTC 1405 

AGTCTTAGGA GGGAAGGAGA TTCCACTTGG CACGGGCAAC AGAGCCCTTA CGTTTAAATT 1465 

TTTCAGTCCA GTTATTGAAC AGCAAGTGTT TGAAATCTTT CTGGCTTGTT TTGGATTTCA 1525 

AAGTGGCAGT TACTGGTGGT TGTTTTTGGA TTTATGGCAA CCAAGTTAGG GCCTCCAGCG 1585 

GTTCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC 1645 

CCCCTCCACC CCGCCCATCC CCACATCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC 1705 

CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCC 1748 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 445 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Glu Ala Gly Glu Glu Pro Leu Leu Leu Ala Glu Leu Lys Pro Gly 
1 5 10 15 

Arg Pro His Gln Phe Asp Trp Lys Ser Ser Cys Glu Thr Trp Ser Val 

20 25 30 

Ala Phe Ser Pro Asp Gly Ser Trp Phe Ala Trp Ser Gln Gly His Cys 

35 40 45 

Val Val Lys Leu Val Pro Trp Pro Leu Glu Glu Gln Phe Ile Pro Lys 
50 55 60 

Gly Phe Glu Ala Lys Ser Arg Ser Ser Lys Asn Asp Pro Lys Gly Arg 
65 70 75 80 






Gly Ser Leu Lys Glu Lys Thr Leu Asp Cys Gly Gln Ile Val Trp Gly 
85 90 95 

Leu Ala Phe Ser Pro Trp Pro Ser Pro Pro Ser Arg Lys Leu Trp Ala 

100 105 110 
Arg His His Pro Gln Ala Pro Asp Val Ser Cys Leu Ile Leu Ala Thr 
115 120 125 

Gly Leu Asn Asp Gly Gln Ile Lys Ile Trp Glu Val Gln Thr Gly Leu 
130 135 140 

Leu Leu Leu Asn Leu Ser Gly His Gln Asp Val Val Arg Asp Leu Ser 
145 150 155 160 

Phe Thr Pro Ser Gly Ser Leu Ile Leu Val Ser Ala Ser Arg Asp Lys 

165 170 175 



Thr Leu Arg Ile Trp Asp Leu Asn Lys His Gly Lys Gln Ile Gln Val 

180 185 190 

Leu Ser Gly His Leu Gln Trp Val Tyr Cys Cys Ser Ile Ser Pro Asp 
195 200 205 



Cys Ser Met Leu Cys Ser Ala Ala Gly Glu Lys Ser Val Phe Leu Trp 
210 215 220 

Ser Met Arg Ser Tyr Thr Leu Ile Arg Lys Leu Glu Gly His Gln Ser 
225 230 235 240 

Ser Val Val Ser Cys Asp Phe Ser Pro Asp Ser Ala Leu Leu Val Thr 

245 250 255 



Ala Ser Tyr Asp Thr Ser Val Ile Met Trp Asp Pro Tyr Thr Gly Glu 

260 265 270 

Arg Leu Arg Ser Leu His His Thr Gln Leu Glu Pro Thr Met Asp Asp 
275 280 285 



Ser Asp Val His Met Ser Ser Leu Arg Ser Val Cys Phe Ser Pro Glu 
290 295 300 

Gly Leu Tyr Leu Ala Thr Val Ala Asp Asp Arg Leu Leu Arg Ile Trp 
305 310 315 320 

Ala Leu Glu Leu Lys Ala Pro Val Ala Phe Ala Pro Met Thr Asn Gly 

325 330 335 



Leu Cys Cys Thr Phe Phe Pro His Gly Gly Ile Ile Ala Thr Gly Thr 

340 345 350 

Arg Asp Gly His Val Gln Phe Trp Thr Ala Pro Arg Val Leu Ser Ser 
355 360 365 



Leu Lys His Leu Cys Arg Lys Ala Leu Arg Ser Phe Leu Thr Thr Tyr 

370 375 380 

Gln Val Leu Ala Leu Pro Ile Pro Lys Lys Met Lys Glu Phe Leu Thr 
385 390 395 400 

Tyr Arg Thr Phe * Ala Thr Pro His Leu Val Leu Leu Cys Ser Arg 

405 410 415 



Val Asn Arg Pro Val Lys Gly Ser Cys Trp Asn Asn Gly Pro Asn Ile 

420 425 430 

Trp Ser Cys Ile Glu Ile Ala Phe Leu Trp Asp Cys Glu 
435 440 445 

(2) INFORMATION FOR SEQ ID NO: 9: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2198 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1419 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1680 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, T, 
or G at positions: 1680, 1691, 1696, 1704, 1707, 1728, 1740, 
1743, 1746, 1755, 1760, 1770, 1773, 1802, 1816, 1817, 1823, 

1826, 1827, 1846, 1851, 1857, 1861, 1880, and 1885." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 1909 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, T, 
or G at positions: 1909, 1917, 1920, 1929, 1946, 1953, 1967-8, 
1980, 1991, 1995, 2001, 2004, 2021, 2033-37, 2039-40, 2042, 
2048, 2051, 2054, 2061, 2075, 2081, and 2083-85." 


(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 2088 

(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, T, 
or G at positions: 2088, 2105, 2121, 2124, 2132, 2137, 2147, 

2149, 2151-52, 2160, 2165, 2177, 2179 and 2196." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 
(B) LOCATION: 494 

(D) OTHER INFORMATION: /note= "Nucleotide may be A or C at 
position 494." 

(ix) FEATURE: 
(A) NAME/KEY: misc_feature 

(B) LOCATION: 498 

(D) OTHER INFORMATION: /note= "Nucleotide may be C or T at 
positions: 498, 501, 1455, 1524, 1527, 1621, 1829, and 2072." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 499 

(D) OTHER INFORMATION: /note= "Nucleotide may be G or C at 
positions: 499, 1618, and 1664." 


(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1673 

(D) OTHER INFORMATION: /note= "Nucleotide may be G or T at 
position 1673." 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 1819 
(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, or 
G at positions: 1819, 1840, and 2089." 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GGC GGT GGT GAT GGC GGC AGG CGC TCG GAC AGC TCC GCT TGA GCT GAG 48 
Gly Gly Gly Asp Gly Gly Arg Arg Ser Asp Ser Ser Ala * Ala Glu 
1 5 10 15 

CTC GGA GAG ATC CGT CCA GAA AGT GCC CAG AAG AAA CTT CCT CTT AGA 96 
Leu Gly Glu Ile Arg Pro Glu Ser Ala Gln Lys Lys Leu Pro Leu Arg 

20 25 30 

AAA GCT GAA AAC ACA ATA TTT ATA ACA CTG GAA ATT GTA AAG AAT TTG 144 
Lys Ala Glu Asn Thr Ile Phe Ile Thr Leu Glu Ile Val Lys Asn Leu 
35 40 45 

TTT AAA ATG GCT GAA AAC AAT AGT AAA AAT GTA GAT GTA CGG CCT AAA 192 
Phe Lys Met Ala Glu Asn Asn Ser Lys Asn Val Asp Val Arg Pro Lys 

50 55 60 

ACA AGT CGG AGT CGA AGT GCT GAC AGG AAG GAT GGT TAT GTG TGG AGT 240 
Thr Ser Arg Ser Arg Ser Ala Asp Arg Lys Asp Gly Tyr Val Trp Ser 
65 70 75 80 

GGA AAG AAG TTG TCT TGG TCC AAA AAG AGT GAG AGT TGT TCT GAA TCT 288 
Gly Lys Lys Leu Ser Trp Ser Lys Lys Ser Glu Ser Cys Ser Glu Ser 

85 90 95 


GAA GCC AAG AAA GGG CAG CTT AGC TGT TCG TCC ATT GAG TTG GAC TTA 336 
Glu Ala Lys Lys Gly Gln Leu Ser Cys Ser Ser Ile Glu Leu Asp Leu 

100 105 110 

GAT CAT TCC TGT GGG CAT AGA TTT TTA GGC CGA TCC CTT AAA CAG AAA 384 
Asp His Ser Cys Gly His Arg Phe Leu Gly Arg Ser Leu Lys Gln Lys 
115 120 125 

CTG CAA GAT GCG GTG GGG CAG TGT TTT CCA ATA AAG AAT TGT AGT GGC 432 
Leu Gln Asp Ala Val Gly Gln Cys Phe Pro Ile Lys Asn Cys Ser Gly 
130 135 140 

CGA CAC TCT CCA GGG CTT CCA TCT AAA AGA AAG ATT CAT ATC AGT GAA 480 
Arg His Ser Pro Gly Leu Pro Ser Lys Arg Lys Ile His Ile Ser Glu 
145 150 155 160 

CTC ATG TTA GAT ACG TGC CCC TTC CCA CCT CGC TCA GAT TTA GCC TTT 528 
Leu Met Leu Asp Thr Cys Pro Phe Pro Pro Arg Ser Asp Leu Ala Phe 

165 170 175 


AGG TGG CAT TTT ATT AAA CGA CAC ACT GTT CCT ATG AGT CCC AAC TCA 576 
Arg Trp His Phe Ile Lys Arg His Thr Val Pro Met Ser Pro Asn Ser 

180 185 190 

GAT GAA TGG GTG AGT GGA GAC CTG TCT GAG AGG AAA CTG AGA GAT GCT 624 
Asp Glu Trp Val Ser Ala Asp Leu Ser Glu Arg Lys Leu Arg Asp Ala 
195 200 205 

CAG CTG AAA CGA AGA AAC ACA GAA GAT GAC ATA CCC TGT TTC TCA CAT 672 
Gln Leu Lys Arg Arg Asn Thr Glu Asp Asp Ile Pro Cys Phe Ser His 
210 215 220 

ACC AAT GGC CAG CCT TGT GTC ATA ACT GCC AAC AGT GCT TCG TGT ACA 720 
Thr Asn Gly Gln Pro Cys Val Ile Thr Ala Asn Ser Ala Ser Cys Thr 
225 230 235 240 

GGT GGT CAC ATA ACT GGT TCT ATG ATG AAC TTG GTC ACA AAC AAC AGC 768 
Gly Gly His Ile Thr Gly Ser Met Met Asn Leu Val Thr Asn Asn Ser 
245 250 255 

ATA GAA GAC AGT GAC ATG GAT TCA GAG GAT GAA ATT ATA ACG CTG TGC 816 
Ile Glu Asp Ser Asp Met Asp Ser Glu Asp Glu Ile Ile Thr Leu Cys 

260 265 270 


ACA AGC TCC AGA AAA AGG AAT AAG CCC AGG TGG GAA ATG GAA GAG GAG 864 
Thr Ser Ser Arg Lys Arg Asn Lys Pro Arg Trp Glu Met Glu Glu Glu 
275 280 285 

ATC CTG CAG TTG GAG GCA CCT CCT AAG TTC CAC ACC CAG ATC GAC TAC 912 
Ile Leu Gln Leu Glu Ala Pro Pro Lys Phe His Thr Gln Ile Asp Tyr 
290 295 300 

GTC CAC TGC CTT GTT CCA GAC CTC CTT CAG ATC AGT AAC AAT CCG TGC 960 
Val His Cys Leu Val Pro Asp Leu Leu Gln Ile Ser Asn Asn Pro Cys 
305 310 315 320 

TAC TGG GGT GTC ATG GAC AAA TAT GCA GCC GAA GCT CTG CTG GAA GGA 1008 
Tyr Trp Gly Val Met Asp Lys Tyr Ala Ala Glu Ala Leu Leu Glu Gly 
325 330 335 

AAG CCA GAG GGC ACC TTT TTA CTT CGA GAT TCA GCG CAG GAA GAT TAT 1056 
Lys Pro Glu Gly Thr Phe Leu Leu Arg Asp Ser Ala Gln Glu Asp Tyr 

340 345 350 


TTA TTC TCT GTT AGT TTT AGA CGC TAC AGT CGT TCT CTT CAT GCT AGA 1104 
Leu Phe Ser Val Ser Phe Arg Arg Tyr Ser Arg Ser Leu His Ala Arg 
355 360 365 

ATT GAG CAG TGG AAT CAT AAC TTT AGC TTT GAT GCC CAT GAT CCT TGT 1152 
Ile Glu Gln Trp Asn His Asn Phe Ser Phe Asp Ala His Asp Pro Cys 
370 375 380 

GTC TTC CAT TCT CCT GAT ATT ACT GGG CTC CTG GAA CAC TAT AAG GAC 1200 
Val Phe His Ser Pro Asp Ile Thr Gly Leu Leu Glu His Tyr Lys Asp 
385 390 395 400 

CCC AGT GCC TGT ATG TTC TTT GAG CCG CTC TTG TCC ACT CCC TTA ATC 1248 
Pro Ser Ala Cys Met Phe Phe Glu Pro Leu Leu Ser Thr Pro Leu Ile 
405 410 415 

CGG ACG TTC CCC TTT TCC TTG CAG CAT ATT TGC AGA ACG GTT ATT TGT 1296 
Arg Thr Phe Pro Phe Ser Leu Gln His Ile Cys Arg Thr Val Ile Cys 

420 425 430 


AAT TGT ACG ACT TAC GAT GGC ATC GAT GCC CTT CCC ATT CCT TCG CCT 1344 
Asn Cys Thr Thr Tyr Asp Gly Ile Asp Ala Leu Pro Ile Pro Ser Pro 
435 440 445 

ATG AAA TTG TAT CTG AAG GAA TAC CAT TAT AAA TCA AAA GTT AGG TTA 1392 
Met Lys Leu Tyr Leu Lys Glu Tyr His Tyr Lys Ser Lys Val Arg Leu 
450 455 460 

CTC AGG ATT GAT GTG CCA GAG CAG CAG TGATGCGGAG AGGTTAGAAT 1439 
Leu Arg Ile Asp Val Pro Glu Gln Gln 
465 470 



GTCCACCGGA GCTTTTGTTC CCTTTAGTGA GGGTTAATTT CGAGCTTGGC GTAATCATGG 1499 
TCATAGCTGT TTCCTGTGTG AAATTGTTAT CCGCTCACAA TTCCACACAA CATACGAGCC 1559 

GGAAGCATAA AGTGTAAAGC CTGGGGTGCC TAATGAGTGA GCTAACTCAC ATTAATTGGG 1619 

TCGCGCTCAC TGCCCGCTTT CCAGTCGGGA AACCTGTCGT GCCAGCTGCA TTACTGAATC 1679 

CGCCAACTCG CCGGGACAGC GGTTAGCCTA TTGGGCGCTC TTCACTTCCT CGCTCACTGA 1739 

CTCCCTCCCT CGGTCCTTCG CTGCTGCTAC CGTCTCCCCC ATCCAAGCGT TATACGCTAT 1799 

CCCCAGAACT GGGAAACCCC GAACACCCTC ACAAAGCTCA CTGCTACCGT ACACGCCCTG 1859 

CCGGCTTTTC CTCGTCCCCC CACACCCTAA ACAGCCCTCG AGTGCAACCC CGATATACAT 1919 

CTCTTCCCTC AACCCCTGCC TCTGTCCCCG CCTCCGACTT CGCTTCCCCG GATTGCTTTC 1979 

CCCCCGTAGT CCGTCCTAGT GCGCCGCGCC TTCCACCCTT CCACCCCTAC GTACCCCCAC 2039 

CCCCCAAACC CCCCCCCCCT CCGATAAAAA GTCAGCGCCT TCACCCCCCC GATAAAAATG 2099 

GTCCCCTACT TTCCAATGTC TCCCCCCCGG CTCTTCTCGC CACCCAACTC ACCTTTCCGG 2159 

CACTGCATCC GGTGCTACCC TCCTGTTTCT CCTCCCCCC 2198 






(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 473 amino acids 
(B) TYPE: amino acid 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Gly Gly Gly Asp Gly Gly Arg Arg Ser Asp Ser Ser Ala * Ala Glu 
1 5 10 15 

Leu Gly Glu Ile Arg Pro Glu Ser Ala Gln Lys Lys Leu Pro Leu Arg 

20 25 30 



Lys Ala Glu Asn Thr Ile Phe Ile Thr Leu Glu Ile Val Lys Asn Leu 

35 40 45 

Phe Lys Met Ala Glu Asn Asn Ser Lys Asn Val Asp Val Arg Pro Lys 

50 55 60 



Thr Ser Arg Ser Arg Ser Ala Asp Arg Lys Asp Gly Tyr Val Trp Ser 
65 70 75 80 

Gly Lys Lys Leu Ser Trp Ser Lys Lys Ser Glu Ser Cys Ser Glu Ser 

85 90 95 

Glu Ala Lys Lys Gly Gln Leu Ser Cys Ser Ser Ile Glu Leu Asp Leu 

100 105 110 



Asp His Ser Cys Gly His Arg Phe Leu Gly Arg Ser Leu Lys Gln Lys 
115 120 125 

Leu Gln Asp Ala Val Gly Gln Cys Phe Pro Ile Lys Asn Cys Ser Gly 

130 135 140 

Arg His Ser Pro Gly Leu Pro Ser Lys Arg Lys Ile His Ile Ser Glu 



« 



20 



145 150 155 160 

Leu Met Leu Asp Thr Cys Pro Phe Pro Pro Arg Ser Asp Leu Ala Phe 

165 170 175 

Arg Trp His Phe He Lys Arg His Thr Val Pro Met Ser Pro Asn Ser 

180 185 190 

Asp Glu Trp Val Ser Ala Asp Leu Ser Glu Arg Lys Leu Arg Asp Ala 
195 200 205 

Gin Leu Lys Arg Arg Asn Thr Glu Asp Asp He Pro Cys Phe Ser His 
210 215 220 

Thr Asn Gly Gin Pro Cys Val He Thr Ala Asn Ser Ala Ser Cys Thr 
225 230 235 240 

Gly Gly His He Thr Gly Ser Met Met Asn Leu Val Thr Asn Asn Ser 

245 250 255 

He Glu Asp Ser Asp Met Asp Ser Glu Asp Glu He He Thr Leu Cys 

260 265 270 

Thr Ser Ser Arg Lys Arg Asn Lys Pro Arg Trp Glu Met Glu Glu Glu 
275 280 285 

He Leu Gin Leu Glu Ala Pro Pro Lys Phe His Thr Gin He Asp Tyr 
290 295 300 

Val His Cys Leu Val Pro Asp Leu Leu Gin He Ser Asn Asn Pro Cys 
305 310 315 320 

Tyr Trp Gly Val Met Asp Lys Tyr Ala Ala Glu Ala Leu Leu Glu Gly 

325 330 335 

Lys Pro Glu Gly Thr Phe Leu Leu Arg Asp Ser Ala Gin Glu Asp Tyr 

340 345 350 

Leu Phe Ser Val Ser Phe Arg Arg Tyr Ser Arg Ser Leu His Ala Arg 
355 360 365 

He Glu Gin Trp Asn His Asn Phe Ser Phe Asp Ala His Asp Pro Cys 
370 375 380 

Val Phe His Ser Pro Asp He Thr Gly Leu Leu Glu His Tyr Lys Asp 
385 390 395 400 

Pro Ser Ala Cys Met Phe Phe Glu Pro Leu Leu Ser Thr Pro Leu He 

405 410 415 

Arg Thr Phe Pro Phe Ser Leu Gin His He Cys Arg Thr Val He Cys 

420 425 430 

Asn Cys Thr Thr Tyr Asp Gly He Asp Ala Leu Pro He Pro Ser Pro 
435 440 445 

Met Lys Leu Tyr Leu Lys Glu Tyr His Tyr Lys Ser Lys Val Arg Leu 
450 455 460 

Leu Arg He Asp Val Pro Glu Gin Gin 
465 470 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2254 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 117..1724

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 740
(D) OTHER INFORMATION: /note= "Nucleotide may be A or C at positions 740, 797, 2139, and 2184."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 761
(D) OTHER INFORMATION: /note= "Nucleotide may be G or T at positions 761, 1313, 1508, and 2226."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 746
(D) OTHER INFORMATION: /note= "Nucleotide may be C or T at positions 746, 1460, 1499, 2009, 2010, 2199, and 2225."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 788
(D) OTHER INFORMATION: /note= "Nucleotide may be A or G at positions 788, 863, 1550, 2178, 2188, 2197, and 2211."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1163
(D) OTHER INFORMATION: /note= "Nucleotide may be G or C at positions 1163 and 1544."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2058
(D) OTHER INFORMATION: /note= "Nucleotide may be A or T at positions 2058 and 2128."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2251
(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, T, or G at position 2251."
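
The misc_feature notes for SEQ ID NO: 11 describe each uncertain position by listing the bases it may be. As an illustrative sketch only (not part of the filed listing), the same information can be expressed with the standard IUPAC single-letter ambiguity codes. The names `IUPAC`, `AMBIGUOUS_SITES`, and `ambiguity_codes` are hypothetical; the positions are copied from the notes for SEQ ID NO: 11.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases allowed.
IUPAC = {
    frozenset("AC"): "M",
    frozenset("GT"): "K",
    frozenset("CT"): "Y",
    frozenset("AG"): "R",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("ACGT"): "N",
}

# 1-based positions taken from the misc_feature notes of SEQ ID NO: 11.
AMBIGUOUS_SITES = {
    "AC": [740, 797, 2139, 2184],
    "GT": [761, 1313, 1508, 2226],
    "CT": [746, 1460, 1499, 2009, 2010, 2199, 2225],
    "AG": [788, 863, 1550, 2178, 2188, 2197, 2211],
    "CG": [1163, 1544],
    "AT": [2058, 2128],
    "ACGT": [2251],
}


def ambiguity_codes(sites):
    """Return a {position: IUPAC code} mapping for every annotated site."""
    return {
        position: IUPAC[frozenset(bases)]
        for bases, positions in sites.items()
        for position in positions
    }


codes = ambiguity_codes(AMBIGUOUS_SITES)
for position in sorted(codes):
    print(position, codes[position])
```

A lookup keyed by position makes it straightforward to check whether a given residue in the sequence description falls on an annotated ambiguous site.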






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CTCGGGCCGG GATGGATCCG CCGGGAAGAG GAAGACAAGC GGAGCGTTGA GCCCCTGCGC 60 
ACGGTGCCCC GCGCGTAGTG GGAGCTTACT CGCAGTAGCT CTCGCTCTTC TAATCA 116

ATG GAT AAA GTG GGG AAA ATG TGG AAC AAC TTA AAA TAC AGA TGC CAG 164
Met Asp Lys Val Gly Lys Met Trp Asn Asn Leu Lys Tyr Arg Cys Gln
1 5 10 15

AAT CTC TTC AGC CAC GAG GGA GGA AGC CGT AAT GAG AAC GTG GAG ATG 212
Asn Leu Phe Ser His Glu Gly Gly Ser Arg Asn Glu Asn Val Glu Met
20 25 30

AAC CCC AAC AGA TGT CCG TCT GTC AAA GAG AAA AGC ATC AGT CTG GGA 260
Asn Pro Asn Arg Cys Pro Ser Val Lys Glu Lys Ser Ile Ser Leu Gly
35 40 45

GAG GCA GCT CCC CAG CAA GAG AGC AGT CCC TTA AGA GAA AAT GTT GCC 308
Glu Ala Ala Pro Gln Gln Glu Ser Ser Pro Leu Arg Glu Asn Val Ala
50 55 60

TTA CAG CTG GGA CTG AGC CCT TCC AAG ACC TTT TCC AGG CGG AAC CAA 356
Leu Gln Leu Gly Leu Ser Pro Ser Lys Thr Phe Ser Arg Arg Asn Gln
65 70 75 80

AAC TGT GCC GCA GAG ATC CCT CAA GTG GTT GAA ATC AGC ATC GAG AAA 404
Asn Cys Ala Ala Glu Ile Pro Gln Val Val Glu Ile Ser Ile Glu Lys
85 90 95

GAC AGT GAC TCG GGT GCC ACC CCA GGA ACG AGG CTT GCA CGG AGA GAC 452
Asp Ser Asp Ser Gly Ala Thr Pro Gly Thr Arg Leu Ala Arg Arg Asp
100 105 110

TCC TAC TCG CGG CAC GCC CCG TGG GGA GGA AAG AAG AAA CAT TCC TGT 500
Ser Tyr Ser Arg His Ala Pro Trp Gly Gly Lys Lys Lys His Ser Cys
115 120 125

TCC ACA AAG ACC CAG AGT TCA TTG GAT ACC GAG AAA AAG TTT GGT AGA 548
Ser Thr Lys Thr Gln Ser Ser Leu Asp Thr Glu Lys Lys Phe Gly Arg
130 135 140

ACT CGA AGC GGC CTT CAG AGG CGA GAG CGG CGC TAT GGA GTC AGC TCC 596
Thr Arg Ser Gly Leu Gln Arg Arg Glu Arg Arg Tyr Gly Val Ser Ser
145 150 155 160

ATG CAG GAC ATG GAC AGC GTT TCT AGC CGC GCG GTC GGG AGC CGC TCC 644
Met Gln Asp Met Asp Ser Val Ser Ser Arg Ala Val Gly Ser Arg Ser
165 170 175

CTG AGG CAG AGG CTC CAG GAC ACG GTG GGT TTG TGT TTT CCC ATG AGA 692
Leu Arg Gln Arg Leu Gln Asp Thr Val Gly Leu Cys Phe Pro Met Arg
180 185 190

ACT TAC AGC AAG CAG TCA AAG CCA CTC TTT TCC AAT AAA AGA AAA ATC 740
Thr Tyr Ser Lys Gln Ser Lys Pro Leu Phe Ser Asn Lys Arg Lys Ile
195 200 205

CAT CTC TCT GAA TTA ATG CTG GAG AAA TGC CCT TTT CCT GCT GGC TCG 788
His Leu Ser Glu Leu Met Leu Glu Lys Cys Pro Phe Pro Ala Gly Ser
210 215 220

GAT TTA GCC CAA AAG TGG CAT TTG ATT AAA CAG CAT ACA GCT CCT GTG 836
Asp Leu Ala Gln Lys Trp His Leu Ile Lys Gln His Thr Ala Pro Val
225 230 235 240

AGC CCA CAT TCA ACA TTT TTT GAT ACG TTT GAT CCA TCT TTG GTT TCT 884
Ser Pro His Ser Thr Phe Phe Asp Thr Phe Asp Pro Ser Leu Val Ser
245 250 255

ACA GAA GAT GAA GAA GAT AGG CTT AGA GAG AGA AGG CGG CTT AGT ATT 932
Thr Glu Asp Glu Glu Asp Arg Leu Arg Glu Arg Arg Arg Leu Ser Ile
260 265 270

GAA GAA GGG GTT GAT CCC CCT CCC AAT GCA CAA ATA CAT ACA TTT GAA 980
Glu Glu Gly Val Asp Pro Pro Pro Asn Ala Gln Ile His Thr Phe Glu
275 280 285

GCT ACT GCA CAG GTT AAT CCA TTA TTT AAA CTG GGA CCA AAA TTA GCT 1028
Ala Thr Ala Gln Val Asn Pro Leu Phe Lys Leu Gly Pro Lys Leu Ala
290 295 300

CCT GGA ATG ACT GAA ATA AGT GGG GAC AGT TCT GCA ATT CCA CAA GCT 1076
Pro Gly Met Thr Glu Ile Ser Gly Asp Ser Ser Ala Ile Pro Gln Ala
305 310 315 320

AAT TGT GAC TCG GAA GAG GAT ACA ACC ACC CTG TGT TTG CAG TCA CGG 1124
Asn Cys Asp Ser Glu Glu Asp Thr Thr Thr Leu Cys Leu Gln Ser Arg
325 330 335

AGG CAG AAG CAG CGT CAG ATA TCT GGA GAC AGC CAT ACC CAT GTT AGC 1172
Arg Gln Lys Gln Arg Gln Ile Ser Gly Asp Ser His Thr His Val Ser
340 345 350

AGA CAG GGA GCT TGG AAA GTC CAC ACA CAG ATT GAT TAC ATA CAC TGC 1220
Arg Gln Gly Ala Trp Lys Val His Thr Gln Ile Asp Tyr Ile His Cys
355 360 365

CTC GTG CCT GAT TTG CTT CAA ATT ACA GGG AAT CCC TGT TAC TGG GGA 1268
Leu Val Pro Asp Leu Leu Gln Ile Thr Gly Asn Pro Cys Tyr Trp Gly
370 375 380

GTG ATG GAC CGT TAT GAA GCA GAA GCC CTC TCC GAA GGG AAA CCG GAA 1316
Val Met Asp Arg Tyr Glu Ala Glu Ala Leu Ser Glu Gly Lys Pro Glu
385 390 395 400

GGC ACG TTC TTG CTC AGG GAC TCT GCA CAG GAG GAC TAC CTC TTC TCT 1364
Gly Thr Phe Leu Leu Arg Asp Ser Ala Gln Glu Asp Tyr Leu Phe Ser
405 410 415

GTG AGT TCC GCC GCT ACA ACA GGA TCT CTG CAC GCC CGG ATC GAG CAG 1412
Val Ser Ser Ala Ala Thr Thr Gly Ser Leu His Ala Arg Ile Glu Gln
420 425 430

TGG AAC CAC AAC TTC AGC TTC GAT GCC CAT GAC CCC TGC GTG TTT CAC 1460
Trp Asn His Asn Phe Ser Phe Asp Ala His Asp Pro Cys Val Phe His
435 440 445

TCC TCC ACT GTC ACG GGG CTT CTC GAA CAC TAT AAA GAC CCC AGT TCG 1508
Ser Ser Thr Val Thr Gly Leu Leu Glu His Tyr Lys Asp Pro Ser Ser
450 455 460

TGC ATG TTT TTT GAA CCG TTG CTA ACG ATA TCA CTC AAT AGG ACT TTC 1556
Cys Met Phe Phe Glu Pro Leu Leu Thr Ile Ser Leu Asn Arg Thr Phe
465 470 475 480

CCT TTC AGC CTG CAG TAT ATC TGC CGC GCA GTG ATC TGC AGA TGC ACT 1604
Pro Phe Ser Leu Gln Tyr Ile Cys Arg Ala Val Ile Cys Arg Cys Thr
485 490 495

ACG TAT GAT GGG ATT GAC GGG CTC CCG CTA CCG TCG ATG TTA CAG GAT 1652
Thr Tyr Asp Gly Ile Asp Gly Leu Pro Leu Pro Ser Met Leu Gln Asp
500 505 510

TTT TTA AAA GAG TAT CAT TAT AAA CAA AAA GTT AGA GTT CGC TGG TTG 1700
Phe Leu Lys Glu Tyr His Tyr Lys Gln Lys Val Arg Val Arg Trp Leu
515 520 525

GAA CGA GAA CCA GTC AAG GCA AAG TAAACTCTCC GGTCCCCAAA GGGTGTTAAC 1754
Glu Arg Glu Pro Val Lys Ala Lys
530 535

TAGGTCCGCT TTCATGTGCA TCAGACAGTA CACCTATAGC AAGCACACGT AGCAGTGTTA 1814 

GGCTTTTTCA TACAGTATGT AAGCTTAGTG TTAGTATCTG TCAGATGCTA CCTGCTGTTA 1874 

CTTATTCAGA TAAACATGGT GCCTATTGGA ACAATAGCGG ATAGAGCTAC AGGTGTTCAG 1934 

TAAGACTACA AAAACATTTT GCCTATTTCG CTAACAGTTT GGTTTTTAAT GGCTGTGGTA 1994 

TTTGAGTGAG GCAACCCTGG GGCATTTGTT ATGAAGAATT CTATTTCTTA CTGAAGAACA 2054

AATAATTAAT ATTGGATGAG TATTTCAACA GTGTGACTAA TGTTTGAAAT TATTTTTTCC 2114 

TAAGAGTTTT TCCTATAACC TTCCAAAAGT CGTGATGTTT GTAGTTACCA TAATCCAGCT 2174 

TTGAAGTCCA AAAGGATTAA AGGCCGCCTC CCTTTGAAAA ATGCCATTTC CGGCCCCAAG 2234 

GCCTAGTGCC GTCCCTCCGG 2254 
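
In the coding region above, each codon line is paired with the amino acids it encodes. As a minimal sketch of that pairing, assuming the standard genetic code (the listing itself does not name a translation table), the first codon line of SEQ ID NO: 11 can be translated as follows. The names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative only.

```python
BASES = "TCAG"
# Standard genetic code in the conventional TCAG order: TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter codes, matching the notation used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) codon by codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


# Codons at positions 117 to 164 of SEQ ID NO: 11.
first_line = "ATG GAT AAA GTG GGG AAA ATG TGG AAC AAC TTA AAA TAC AGA TGC CAG"
protein = translate(first_line)
print(protein)
```

This sketch does not handle the ambiguous positions noted in the feature table; a codon containing a non-ACGT symbol would raise a `KeyError`.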



(2) INFORMATION FOR SEQ ID NO: 12:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Asp Lys Val Gly Lys Met Trp Asn Asn Leu Lys Tyr Arg Cys Gln
1 5 10 15

Asn Leu Phe Ser His Glu Gly Gly Ser Arg Asn Glu Asn Val Glu Met
20 25 30

Asn Pro Asn Arg Cys Pro Ser Val Lys Glu Lys Ser Ile Ser Leu Gly
35 40 45

Glu Ala Ala Pro Gln Gln Glu Ser Ser Pro Leu Arg Glu Asn Val Ala
50 55 60

Leu Gln Leu Gly Leu Ser Pro Ser Lys Thr Phe Ser Arg Arg Asn Gln
65 70 75 80

Asn Cys Ala Ala Glu Ile Pro Gln Val Val Glu Ile Ser Ile Glu Lys
85 90 95

Asp Ser Asp Ser Gly Ala Thr Pro Gly Thr Arg Leu Ala Arg Arg Asp
100 105 110

Ser Tyr Ser Arg His Ala Pro Trp Gly Gly Lys Lys Lys His Ser Cys
115 120 125

Ser Thr Lys Thr Gln Ser Ser Leu Asp Thr Glu Lys Lys Phe Gly Arg
130 135 140

Thr Arg Ser Gly Leu Gln Arg Arg Glu Arg Arg Tyr Gly Val Ser Ser
145 150 155 160

Met Gln Asp Met Asp Ser Val Ser Ser Arg Ala Val Gly Ser Arg Ser
165 170 175

Leu Arg Gln Arg Leu Gln Asp Thr Val Gly Leu Cys Phe Pro Met Arg
180 185 190

Thr Tyr Ser Lys Gln Ser Lys Pro Leu Phe Ser Asn Lys Arg Lys Ile
195 200 205

His Leu Ser Glu Leu Met Leu Glu Lys Cys Pro Phe Pro Ala Gly Ser
210 215 220

Asp Leu Ala Gln Lys Trp His Leu Ile Lys Gln His Thr Ala Pro Val
225 230 235 240

Ser Pro His Ser Thr Phe Phe Asp Thr Phe Asp Pro Ser Leu Val Ser
245 250 255

Thr Glu Asp Glu Glu Asp Arg Leu Arg Glu Arg Arg Arg Leu Ser Ile
260 265 270

Glu Glu Gly Val Asp Pro Pro Pro Asn Ala Gln Ile His Thr Phe Glu
275 280 285

Ala Thr Ala Gln Val Asn Pro Leu Phe Lys Leu Gly Pro Lys Leu Ala
290 295 300

Pro Gly Met Thr Glu Ile Ser Gly Asp Ser Ser Ala Ile Pro Gln Ala
305 310 315 320

Asn Cys Asp Ser Glu Glu Asp Thr Thr Thr Leu Cys Leu Gln Ser Arg
325 330 335

Arg Gln Lys Gln Arg Gln Ile Ser Gly Asp Ser His Thr His Val Ser
340 345 350

Arg Gln Gly Ala Trp Lys Val His Thr Gln Ile Asp Tyr Ile His Cys
355 360 365

Leu Val Pro Asp Leu Leu Gln Ile Thr Gly Asn Pro Cys Tyr Trp Gly
370 375 380

Val Met Asp Arg Tyr Glu Ala Glu Ala Leu Ser Glu Gly Lys Pro Glu
385 390 395 400

Gly Thr Phe Leu Leu Arg Asp Ser Ala Gln Glu Asp Tyr Leu Phe Ser
405 410 415

Val Ser Ser Ala Ala Thr Thr Gly Ser Leu His Ala Arg Ile Glu Gln
420 425 430

Trp Asn His Asn Phe Ser Phe Asp Ala His Asp Pro Cys Val Phe His
435 440 445

Ser Ser Thr Val Thr Gly Leu Leu Glu His Tyr Lys Asp Pro Ser Ser
450 455 460

Cys Met Phe Phe Glu Pro Leu Leu Thr Ile Ser Leu Asn Arg Thr Phe
465 470 475 480

Pro Phe Ser Leu Gln Tyr Ile Cys Arg Ala Val Ile Cys Arg Cys Thr
485 490 495

Thr Tyr Asp Gly Ile Asp Gly Leu Pro Leu Pro Ser Met Leu Gln Asp
500 505 510

Phe Leu Lys Glu Tyr His Tyr Lys Gln Lys Val Arg Val Arg Trp Leu
515 520 525

Glu Arg Glu Pro Val Lys Ala Lys
530 535
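
The CDS location stated for each cDNA in this listing can be cross-checked against the stated length of the corresponding protein record: an inclusive span of `start..end` covers `end - start + 1` nucleotides, and one third of that is the residue count. The following is a hedged sketch of that arithmetic; the function name `codon_count` and the `CDS_RECORDS` table are illustrative, and the figures are copied from the headers of SEQ ID NOs: 11 to 16.

```python
def codon_count(start, end):
    """Number of codons in an inclusive, 1-based CDS span start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a whole number of codons")
    return length // 3


# (cDNA record, CDS start, CDS end, stated length of the paired protein record)
CDS_RECORDS = [
    ("SEQ ID NO: 11", 117, 1724, 536),  # protein: SEQ ID NO: 12
    ("SEQ ID NO: 13", 2, 1375, 458),    # protein: SEQ ID NO: 14
    ("SEQ ID NO: 15", 453, 1388, 312),  # protein: SEQ ID NO: 16
]

consistent = {
    name: codon_count(start, end) == expected
    for name, start, end, expected in CDS_RECORDS
}
print(consistent)
```

For SEQ ID NO: 15, the 312 positions include the in-frame stop codons that the listing marks with an asterisk.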

(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2206 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..1375

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2078
(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, T, or G at positions 2078 and 2116."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2063
(D) OTHER INFORMATION: /note= "Nucleotide may be G or C at position 2063."



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 


G GAG CGC GGC CTG GAG ACT AAC AGC TGC TCG GAA GAG GAG CTC AGC 46
Glu Arg Gly Leu Glu Thr Asn Ser Cys Ser Glu Glu Glu Leu Ser
1 5 10 15

AGC CCG GGT CGC GGA GGA GGA GGG GGC GGC CGG CTT CTG CTG CAG CCC 94
Ser Pro Gly Arg Gly Gly Gly Gly Gly Gly Arg Leu Leu Leu Gln Pro
20 25 30

CCA GGC CCT GAA TTA CCT CCG GTG CCC TTC CCG CTG CAG GAC TTG GTC 142
Pro Gly Pro Glu Leu Pro Pro Val Pro Phe Pro Leu Gln Asp Leu Val
35 40 45

CCT CTG GGG CGC CTG AGT AGA GGG GAG CAG CAG CAG CAG CAG CAG CAG 190
Pro Leu Gly Arg Leu Ser Arg Gly Glu Gln Gln Gln Gln Gln Gln Gln
50 55 60

CAA CCT CCC CCG CCC CCG CCT CCT CCC GGG CCC CTC CGG CCA CTC GCG 238
Gln Pro Pro Pro Pro Pro Pro Pro Pro Gly Pro Leu Arg Pro Leu Ala
65 70 75

GGT CCT TCT CGG AAG GGC TCC TTC AAA ATC CGC CTC AGT CGC CTC TTT 286
Gly Pro Ser Arg Lys Gly Ser Phe Lys Ile Arg Leu Ser Arg Leu Phe
80 85 90 95

CGC ACC AAG AGC TGC AAC GGT GGC TCC GGC GGT GGG GAT GGG ACC GGC 334
Arg Thr Lys Ser Cys Asn Gly Gly Ser Gly Gly Gly Asp Gly Thr Gly
100 105 110

AAG AGG CCT TCT GGA GAG CTG GCT GCT TCA GCT GCG AGC CTG ACA GAC 382
Lys Arg Pro Ser Gly Glu Leu Ala Ala Ser Ala Ala Ser Leu Thr Asp
115 120 125

ATG GGA GGC TCT GCG GGC CGG GAG CTG GAC GCG GGG AGG AAA CCC AAG 430
Met Gly Gly Ser Ala Gly Arg Glu Leu Asp Ala Gly Arg Lys Pro Lys
130 135 140

TTG ACA AGA ACT CAA AGT GCC TTT TCT CCG GTC TCC TTC AGC CCC CTG 478
Leu Thr Arg Thr Gln Ser Ala Phe Ser Pro Val Ser Phe Ser Pro Leu
145 150 155

TTC ACA GGT GAA ACT GTG TCG CTT GTG GAT GTG GAC ATT TCT CAG CGG 526
Phe Thr Gly Glu Thr Val Ser Leu Val Asp Val Asp Ile Ser Gln Arg
160 165 170 175

GGC CTG ACC TCT CCA CAC CCT CCA ACT CCC CCT CCT CCT CCG AGA AGA 574
Gly Leu Thr Ser Pro His Pro Pro Thr Pro Pro Pro Pro Pro Arg Arg
180 185 190

AGC CTC AGC CTC CTA GAT GAT ATC AGT GGG ACG CTG CCT ACA TCT GTC 622
Ser Leu Ser Leu Leu Asp Asp Ile Ser Gly Thr Leu Pro Thr Ser Val
195 200 205

CTT GTG GCT CCG ATG GGG TCT TCC TTG CAG TCT TTC CCC CTA CCT CCG 670
Leu Val Ala Pro Met Gly Ser Ser Leu Gln Ser Phe Pro Leu Pro Pro
210 215 220

CCT CCT CCA CCC CAT GCC CCA GAT GCA TTT CCC CGG ATT GCT CCC ATC 718
Pro Pro Pro Pro His Ala Pro Asp Ala Phe Pro Arg Ile Ala Pro Ile
225 230 235

CGA GCA GCT GAA TCC CTG CAC AGC CAA CCC CCA CAG CAC CTC CAG TGT 766
Arg Ala Ala Glu Ser Leu His Ser Gln Pro Pro Gln His Leu Gln Cys
240 245 250 255

CCC CTC TAC CGG CCT GAC TCG AGC AGC TTT GCA GCC AGC CTT CGA GAG 814
Pro Leu Tyr Arg Pro Asp Ser Ser Ser Phe Ala Ala Ser Leu Arg Glu
260 265 270

TTG GAG AAG TGT GGT TGG TAT TGG GGG CCA ATG AAT TGG GAA GAT GCA 862
Leu Glu Lys Cys Gly Trp Tyr Trp Gly Pro Met Asn Trp Glu Asp Ala
275 280 285

GAG ATG AAG CTG AAA GGG AAA CCA GAT GGT TCT TTC CTG GTA CGA GAC 910
Glu Met Lys Leu Lys Gly Lys Pro Asp Gly Ser Phe Leu Val Arg Asp
290 295 300

AGT TCT GAT CCT CGT TAC ATC CTG AGC CTC AGT TTC CGA TCA CAG GGT 958
Ser Ser Asp Pro Arg Tyr Ile Leu Ser Leu Ser Phe Arg Ser Gln Gly
305 310 315

ATC ACC CAC CAC ACT AGA ATG GAG CAC TAC AGA GGA ACC TTC AGC CTG 1006
Ile Thr His His Thr Arg Met Glu His Tyr Arg Gly Thr Phe Ser Leu
320 325 330 335

TGG TGT CAT CCC AAG TTT GAG GAC CGC TGT CAA TCT GTT GTA GAG TTT 1054
Trp Cys His Pro Lys Phe Glu Asp Arg Cys Gln Ser Val Val Glu Phe
340 345 350

ATT AAG AGA GCC ATT ATG CAC TCC AAG AAT GGA AAG TTT CTC TAT TTC 1102
Ile Lys Arg Ala Ile Met His Ser Lys Asn Gly Lys Phe Leu Tyr Phe
355 360 365

TTA AGA TCC AGG GTT CCA GGA CTG CCA CCA ACT CCT GTC CAG CTG CTC 1150
Leu Arg Ser Arg Val Pro Gly Leu Pro Pro Thr Pro Val Gln Leu Leu
370 375 380

TAT CCA GTG TCC CGA TTC AGC AAT GTC AAA TCC CTC CAG CAC CTT TGC 1198
Tyr Pro Val Ser Arg Phe Ser Asn Val Lys Ser Leu Gln His Leu Cys
385 390 395

AGA TTC CGG ATA CGA CAG CTC GTC AGG ATA GAT CAC ATC CCA GAT CTC 1246
Arg Phe Arg Ile Arg Gln Leu Val Arg Ile Asp His Ile Pro Asp Leu
400 405 410 415

CCA CTG CCT AAA CCT CTG ATC TCT TAT ATC CGA AAG TTC TAG TAC TAT 1294
Pro Leu Pro Lys Pro Leu Ile Ser Tyr Ile Arg Lys Phe Tyr Tyr Tyr
420 425 430

GAT CCT CAG GAA GAG GTA TAC CTG TCT CTA AAG GAA GCG CAG CTC ATT 1342
Asp Pro Gln Glu Glu Val Tyr Leu Ser Leu Lys Glu Ala Gln Leu Ile
435 440 445

TCC AAA CAG AAG CAA GAG GTG GAA CCC TCC ACG TAGCGAGGGG CTCCCTGCTG 1395
Ser Lys Gln Lys Gln Glu Val Glu Pro Ser Thr
450 455



GTCACCACCA AGGGCATTTG GTTGCCAAGC TCCAGCTTTG AAGAACCAAA TTAAGCTACC 1455 

ATGAAAAGAA GAGGAAAAGT GAGGGAACAG GAAGGTTGGG ATTCTCTGTG CAGAGACTTT 1515 

GGTTCCCCAC GCAGCCCTGG GGCTTGGAAG AAGCACATGA CCGTACTCTG CGTGGGGCTC 1575 

CACCTCACAC CCACCCCTGG GCATCTTAGG ACTGGAGGGG CTCCTTGGAA AACTGGAAGA 1635

AGTCTCAACA CTGTTTCTTT TTCAAAAAAA AAAAAAAAAA AGATGCGGCC GCAAGCTTAT 1695 

TCCCTTTAGT GAGGGTTAAT TTTAGCTTGG CACTGGCCGT CGTTTTACAA CGTCGTGACT 1755 


GGGAAAACCC TGGCGTTACC CAACTTAATC GCCTTGCAGC ACATCCCCCT TTCGCCAGCT 1815 

GGCGTAATAG CGAAGAGGCC CGCACCGATC GCCCTTCCCA ACAGTTGCGC AGCCTGAATG 1875 

GCGAATGGGA CGCGCCCTGT AGCGGCGCAT TAACGCGCGG CGGGTGTGGT GGTTACGCGC 1935

AGCGTGACCG CTACACTTGC CAGCGCCCTA CGCCCGCTCC TTTCGCTTTC TTCCCTTCCT 1995 

TTCTCGCCAC GTTCGCCGGC TTTCCCCGTC AACTCTAAAT CGGGGGCTCC CTTTAGGTTC 2055 


CGATTTACTG CTTTACGCAC TCCACCCCAA AACTTGATTA GGTGATGTCA CTTATGGCAC 2115 

CCCTGATAAC GTTTCCCCTT ACTTTGATCA CTTCTTTATA TGATCTTTCC AATGAAACAT 2175 

CACCTACTCG TCATCTTTAT TTAAAGATTT G 2206



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 458 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Glu Arg Gly Leu Glu Thr Asn Ser Cys Ser Glu Glu Glu Leu Ser Ser
1 5 10 15

Pro Gly Arg Gly Gly Gly Gly Gly Gly Arg Leu Leu Leu Gln Pro Pro
20 25 30

Gly Pro Glu Leu Pro Pro Val Pro Phe Pro Leu Gln Asp Leu Val Pro
35 40 45

Leu Gly Arg Leu Ser Arg Gly Glu Gln Gln Gln Gln Gln Gln Gln Gln
50 55 60

Pro Pro Pro Pro Pro Pro Pro Pro Gly Pro Leu Arg Pro Leu Ala Gly
65 70 75 80

Pro Ser Arg Lys Gly Ser Phe Lys Ile Arg Leu Ser Arg Leu Phe Arg
85 90 95

Thr Lys Ser Cys Asn Gly Gly Ser Gly Gly Gly Asp Gly Thr Gly Lys
100 105 110

Arg Pro Ser Gly Glu Leu Ala Ala Ser Ala Ala Ser Leu Thr Asp Met
115 120 125

Gly Gly Ser Ala Gly Arg Glu Leu Asp Ala Gly Arg Lys Pro Lys Leu
130 135 140

Thr Arg Thr Gln Ser Ala Phe Ser Pro Val Ser Phe Ser Pro Leu Phe
145 150 155 160

Thr Gly Glu Thr Val Ser Leu Val Asp Val Asp Ile Ser Gln Arg Gly
165 170 175

Leu Thr Ser Pro His Pro Pro Thr Pro Pro Pro Pro Pro Arg Arg Ser
180 185 190

Leu Ser Leu Leu Asp Asp Ile Ser Gly Thr Leu Pro Thr Ser Val Leu
195 200 205

Val Ala Pro Met Gly Ser Ser Leu Gln Ser Phe Pro Leu Pro Pro Pro
210 215 220

Pro Pro Pro His Ala Pro Asp Ala Phe Pro Arg Ile Ala Pro Ile Arg
225 230 235 240

Ala Ala Glu Ser Leu His Ser Gln Pro Pro Gln His Leu Gln Cys Pro
245 250 255

Leu Tyr Arg Pro Asp Ser Ser Ser Phe Ala Ala Ser Leu Arg Glu Leu
260 265 270

Glu Lys Cys Gly Trp Tyr Trp Gly Pro Met Asn Trp Glu Asp Ala Glu
275 280 285

Met Lys Leu Lys Gly Lys Pro Asp Gly Ser Phe Leu Val Arg Asp Ser
290 295 300

Ser Asp Pro Arg Tyr Ile Leu Ser Leu Ser Phe Arg Ser Gln Gly Ile
305 310 315 320

Thr His His Thr Arg Met Glu His Tyr Arg Gly Thr Phe Ser Leu Trp
325 330 335

Cys His Pro Lys Phe Glu Asp Arg Cys Gln Ser Val Val Glu Phe Ile
340 345 350

Lys Arg Ala Ile Met His Ser Lys Asn Gly Lys Phe Leu Tyr Phe Leu
355 360 365

Arg Ser Arg Val Pro Gly Leu Pro Pro Thr Pro Val Gln Leu Leu Tyr
370 375 380

Pro Val Ser Arg Phe Ser Asn Val Lys Ser Leu Gln His Leu Cys Arg
385 390 395 400

Phe Arg Ile Arg Gln Leu Val Arg Ile Asp His Ile Pro Asp Leu Pro
405 410 415

Leu Pro Lys Pro Leu Ile Ser Tyr Ile Arg Lys Phe Tyr Tyr Tyr Asp
420 425 430

Pro Gln Glu Glu Val Tyr Leu Ser Leu Lys Glu Ala Gln Leu Ile Ser
435 440 445

Lys Gln Lys Gln Glu Val Glu Pro Ser Thr
450 455

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1390 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 453..1388

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 108
(D) OTHER INFORMATION: /note= "Nucleotide may be A, C, T, or G at positions 108 and 109."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 236
(D) OTHER INFORMATION: /note= "Nucleotide may be A or G at positions 236, 238, and 1258."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 233
(D) OTHER INFORMATION: /note= "Nucleotide may be G or T at position 233."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 234
(D) OTHER INFORMATION: /note= "Nucleotide may be G or C at position 234."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 237
(D) OTHER INFORMATION: /note= "Nucleotide may be C or T at position 237."

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 239
(D) OTHER INFORMATION: /note= "Nucleotide may be A or T at position 239."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
CGGACGCGTG GGTTTGGCTG TGAATATTCT ATTTGCTTGC AGTATCTGTT TCTCTTCCTA 60 
GGCTCAAGTT GGTGACCCAA GCCTATTGTA AACAAGTGAT TATCTCACCG GGAGATGCCA 120
ATGGAGTAAC AATTTGTTAA CCTTACGTTT TCTGTCTGTA TATTTTTTTA AAAATCTGGT 180 
AGTTTCTGGA AAAAAAAGAG AAGGGGGTTT GTAGTACTTA ACCCTATTTA TTGCCACGAG 240 
TTTTAGTTAA TTAGTTTTTG GAATAAATGG ATTTCAGTAT AGCTTTGTGG TTAAATTGCA 300 
TTGCCTTTAT TTTATGTTTA GGCTTATTTT TAAATTAACA TTTAACAGAA ACATTTGAAA 360 
TAGAATTTGC ATGTCTGCCT TAATTAACTT AAAGACTGAT TTTAATCTGA CTATGACACT 420






GAGCATATTC TTTAAATTAC TCATAATTTA TA ATG CTT AAT ATA ATC TTA ATT 473
Met Leu Asn Ile Ile Leu Ile
1 5

AAA TTT AGC AGT TTT AGT ATA AGA TGT GCC ATT TTG TCC TCT GTA TGT 521
Lys Phe Ser Ser Phe Ser Ile Arg Cys Ala Ile Leu Ser Ser Val Cys
10 15 20

CTG AAT GAA GCT ATA ACA TTT GCC TTT TTA TTG CAG GTT TTC CTT TGG 569
Leu Asn Glu Ala Ile Thr Phe Ala Phe Leu Leu Gln Val Phe Leu Trp
25 30 35

AAT ATG GAT AAA TAC ACC ATG ATA CGG AAA CTA GAA GGA CAT CAC CAT 617
Asn Met Asp Lys Tyr Thr Met Ile Arg Lys Leu Glu Gly His His His
40 45 50 55

GAT GTG GTA GCT TGT GAC TTT TCT CCT GAT GGA GCA TTA CTG GCT ACT 665
Asp Val Val Ala Cys Asp Phe Ser Pro Asp Gly Ala Leu Leu Ala Thr
60 65 70

GCA TCT TAT GAT ACT CGA GTA TAT ATC TGG GAT CCA CAT AAT GGA GAC 713
Ala Ser Tyr Asp Thr Arg Val Tyr Ile Trp Asp Pro His Asn Gly Asp
75 80 85

ATT CTG ATG GAA TTT GGG CAC CTG TTT CCC CCA CCT ACT CCA ATA TTT 761
Ile Leu Met Glu Phe Gly His Leu Phe Pro Pro Pro Thr Pro Ile Phe
90 95 100

GCT GGA GGA GCA AAT GAC CGG TGG GTA CGA TCT GTA TCT TTT AGC CAT 809
Ala Gly Gly Ala Asn Asp Arg Trp Val Arg Ser Val Ser Phe Ser His
105 110 115

GAT GGA CTG CAT GTT GCA AGC CTT GCT GAT GAT AAA ATG GTG AGG TTC 857
Asp Gly Leu His Val Ala Ser Leu Ala Asp Asp Lys Met Val Arg Phe
120 125 130 135

TGG AGA ATT GAT GAG GAT TAT CCA GTG CAA GTT GCA CCT TTG AGC AAT 905
Trp Arg Ile Asp Glu Asp Tyr Pro Val Gln Val Ala Pro Leu Ser Asn
140 145 150

GGT CTT TGC TGT GCC TTC TCT ACT GAT GGC AGT GTT TTA GCT GCT GGG 953
Gly Leu Cys Cys Ala Phe Ser Thr Asp Gly Ser Val Leu Ala Ala Gly
155 160 165

ACA CAT GAC GGA AGT GTG TAT TTT TGG GCC ACT CCA CGG CAG GTC CCT 1001
Thr His Asp Gly Ser Val Tyr Phe Trp Ala Thr Pro Arg Gln Val Pro
170 175 180

AGC CTG CAA CAT TTA TGT CGC ATG TCA ATC CGA AGA GTG ATG CCC ACC 1049
Ser Leu Gln His Leu Cys Arg Met Ser Ile Arg Arg Val Met Pro Thr
185 190 195

CAA GAA GTT CAG GAG CTG CCG ATT CCT TCC AAG CTT TTG GAG TTT CTC 1097
Gln Glu Val Gln Glu Leu Pro Ile Pro Ser Lys Leu Leu Glu Phe Leu
200 205 210 215

TCG TAT CGT ATT TAG AAG ATT CTG CCT TCC CTA GTA GTA GGG ACT GAC 1145
Ser Tyr Arg Ile * Lys Ile Leu Pro Ser Leu Val Val Gly Thr Asp
220 225 230

AGA ATA CAC TTA ACA CAA ACC TCA AGC TTT ACT GAC TTC AAT TAT CTG 1193
Arg Ile His Leu Thr Gln Thr Ser Ser Phe Thr Asp Phe Asn Tyr Leu
235 240 245

TTT TTA AAG ACG TAG AAG ATT TAT TTA ATT TGA TAT GTT CTT GTA CTG 1241
Phe Leu Lys Thr * Lys Ile Tyr Leu Ile * Tyr Val Leu Val Leu
250 255 260

CAT TTT GAT CAG TTG AAG CTT TTA AAA TAT TAT TTA TAG ACA ATA GAA 1289
His Phe Asp Gln Leu Lys Leu Leu Lys Tyr Tyr Leu * Thr Ile Glu
265 270 275

GTA TTT CTG AAC ATA TCA AAT ATA AAT TTT TTT AAA GAT CTA ACT GTG 1337
Val Phe Leu Asn Ile Ser Asn Ile Asn Phe Phe Lys Asp Leu Thr Val
280 285 290 295

AAA AAC ATA CAT ACC TGT ACA TAT TTA GAT ATA AGC TGC TAT ATG TTG 1385
Lys Asn Ile His Thr Cys Thr Tyr Leu Asp Ile Ser Cys Tyr Met Leu
300 305 310



AAT GG 1390
Asn

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 312 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Met Leu Asn Ile Ile Leu Ile Lys Phe Ser Ser Phe Ser Ile Arg Cys
1 5 10 15

Ala Ile Leu Ser Ser Val Cys Leu Asn Glu Ala Ile Thr Phe Ala Phe
20 25 30

Leu Leu Gln Val Phe Leu Trp Asn Met Asp Lys Tyr Thr Met Ile Arg
35 40 45

Lys Leu Glu Gly His His His Asp Val Val Ala Cys Asp Phe Ser Pro
50 55 60

Asp Gly Ala Leu Leu Ala Thr Ala Ser Tyr Asp Thr Arg Val Tyr Ile
65 70 75 80

Trp Asp Pro His Asn Gly Asp Ile Leu Met Glu Phe Gly His Leu Phe
85 90 95

Pro Pro Pro Thr Pro Ile Phe Ala Gly Gly Ala Asn Asp Arg Trp Val
100 105 110

Arg Ser Val Ser Phe Ser His Asp Gly Leu His Val Ala Ser Leu Ala
115 120 125

Asp Asp Lys Met Val Arg Phe Trp Arg Ile Asp Glu Asp Tyr Pro Val
130 135 140

Gln Val Ala Pro Leu Ser Asn Gly Leu Cys Cys Ala Phe Ser Thr Asp
145 150 155 160

Gly Ser Val Leu Ala Ala Gly Thr His Asp Gly Ser Val Tyr Phe Trp
165 170 175

Ala Thr Pro Arg Gln Val Pro Ser Leu Gln His Leu Cys Arg Met Ser
180 185 190

Ile Arg Arg Val Met Pro Thr Gln Glu Val Gln Glu Leu Pro Ile Pro
195 200 205

Ser Lys Leu Leu Glu Phe Leu Ser Tyr Arg Ile * Lys Ile Leu Pro
210 215 220

Ser Leu Val Val Gly Thr Asp Arg Ile His Leu Thr Gln Thr Ser Ser
225 230 235 240

Phe Thr Asp Phe Asn Tyr Leu Phe Leu Lys Thr * Lys Ile Tyr Leu
245 250 255

Ile * Tyr Val Leu Val Leu His Phe Asp Gln Leu Lys Leu Leu Lys
260 265 270

Tyr Tyr Leu * Thr Ile Glu Val Phe Leu Asn Ile Ser Asn Ile Asn
275 280 285

Phe Phe Lys Asp Leu Thr Val Lys Asn Ile His Thr Cys Thr Tyr Leu
290 295 300

Asp Ile Ser Cys Tyr Met Leu Asn
305 310

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 257 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Met Val Leu Cys Val Gln Gly Ser Cys Pro Leu Leu Ala Val Glu Gln
1 5 10 15

Ile Gly Arg Arg Pro Leu Trp Ala Gln Ser Leu Glu Leu Pro Gly Pro
20 25 30

Ala Met Gln Pro Leu Pro Thr Gly Ala Phe Pro Glu Glu Val Thr Glu
35 40 45

Glu Thr Pro Val Gln Ala Glu Asn Glu Pro Lys Val Leu Asp Pro Glu
50 55 60

Gly Asp Leu Leu Cys Ile Ala Lys Thr Phe Ser Tyr Leu Arg Glu Ser
65 70 75 80

Gly Trp Tyr Trp Gly Ser Ile Thr Ala Ser Glu Ala Arg Gln His Leu
85 90 95

Gln Lys Met Pro Glu Gly Thr Phe Leu Val Arg Asp Ser Thr His Pro
100 105 110

Ser Tyr Leu Phe Thr Leu Ser Val Lys Thr Thr Arg Gly Pro Thr Asn
115 120 125

Val Arg Ile Glu Tyr Ala Asp Ser Ser Phe Arg Leu Asp Ser Asn Cys
130 135 140

Leu Ser Arg Pro Arg Ile Leu Ala Phe Pro Asp Val Val Ser Leu Val
145 150 155 160

Gln His Tyr Val Ala Ser Cys Ala Ala Asp Thr Arg Ser Asp Ser Pro
165 170 175

Asp Pro Ala Pro Thr Pro Ala Leu Pro Met Ser Lys Gln Asp Ala Pro
180 185 190

Ser Asp Ser Val Leu Pro Ile Pro Val Ala Thr Ala Val His Leu Lys
195 200 205

Leu Val Gln Pro Phe Val Arg Arg Ser Ser Ala Arg Ser Leu Gln His
210 215 220

Leu Cys Arg Leu Val Ile Asn Arg Leu Val Ala Asp Val Asp Cys Leu
225 230 235 240

Pro Leu Pro Arg Arg Met Ala Asp Tyr Leu Arg Gln Tyr Pro Phe Gln
245 250 255

Leu



(2) INFORMATION FOR SEQ ID NO: 18: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 211 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Val Ala His Asn Gln Val Ala Ala Asp Asn Ala Val Ser Thr Ala
1 5 10 15

Ala Glu Pro Arg Arg Arg Pro Glu Pro Ser Ser Ser Ser Ser Ser Ser
20 25 30

Pro Ala Ala Pro Ala Arg Pro Arg Pro Cys Pro Ala Val Pro Ala Pro
35 40 45

Ala Pro Gly Asp Thr His Phe Arg Thr Phe Arg Ser His Ala Asp Tyr
50 55 60

Arg Arg Ile Thr Arg Ala Ser Ala Leu Leu Asp Ala Cys Gly Phe Tyr
65 70 75 80

Trp Gly Pro Leu Ser Val His Gly Ala His Glu Arg Leu Arg Ala Glu
85 90 95

Pro Val Gly Thr Phe Leu Val Arg Asp Ser Arg Gln Arg Asn Cys Phe
100 105 110

Phe Ala Leu Ser Val Lys Met Ala Ser Gly Pro Thr Ser Ile Arg Val
115 120 125

His Phe Gln Ala Gly Arg Phe His Leu Asp Gly Ser Arg Glu Ser Phe
130 135 140

Asp Cys Leu Phe Glu Leu Leu Glu His Tyr Val Ala Ala Pro Arg Arg
145 150 155 160

Met Leu Gly Ala Pro Leu Arg Gln Arg Arg Val Arg Pro Leu Gln Glu
165 170 175

Leu Cys Arg Gln Arg Ile Val Ala Thr Val Gly Arg Glu Asn Leu Ala
180 185 190

Arg Ile Pro Leu Asn Pro Val Leu Arg Asp Tyr Leu Ser Ser Phe Pro
195 200 205

Phe Gln Ile
210

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 212 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Val Ala Arg Asn Gln Val Ala Ala Asp Asn Ala Ile Ser Pro Ala
1 5 10 15

Ala Glu Pro Arg Arg Arg Ser Glu Pro Ser Ser Ser Ser Ser Ser Ser
20 25 30

Ser Pro Ala Ala Pro Val Arg Pro Arg Pro Cys Pro Ala Val Pro Ala
35 40 45

Pro Ala Pro Gly Asp Thr His Phe Arg Thr Phe Arg Ser His Ser Asp
50 55 60

Tyr Arg Arg Ile Thr Arg Thr Ser Ala Leu Leu Asp Ala Cys Gly Phe
65 70 75 80

Tyr Trp Gly Pro Leu Ser Val His Gly Ala His Glu Arg Leu Arg Ala
85 90 95

Glu Pro Val Gly Thr Phe Leu Val Arg Asp Ser Arg Gln Arg Asn Cys
100 105 110

Phe Phe Ala Leu Ser Val Lys Met Ala Ser Gly Pro Thr Ser Ile Arg
115 120 125

Val His Phe Gln Ala Gly Arg Phe His Leu Asp Gly Ser Arg Glu Thr
130 135 140

Phe Asp Cys Leu Phe Glu Leu Leu Glu His Tyr Val Ala Ala Pro Arg
145 150 155 160

Arg Met Leu Gly Ala Pro Leu Arg Gln Arg Arg Val Arg Pro Leu Gln
165 170 175

Glu Leu Cys Arg Gln Arg Ile Val Ala Ala Val Gly Arg Glu Asn Leu
180 185 190

Ala Arg Ile Pro Leu Asn Pro Val Leu Arg Asp Tyr Leu Ser Ser Phe
195 200 205

Pro Phe Gln Ile
210

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 306 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Ala Leu Ser Pro Ala Ala Thr Leu Thr Ala Trp Pro Ala Asp Ser Ala
1 5 10 15

Arg Arg Gly Pro Gly Cys Thr Ala Ser Gly Tyr Pro Val Pro Ala Ala
20 25 30

Arg Ala Pro Ala Ala Gly Asp Gln Trp Val Thr Ala Ala Ala Arg Asp
35 40 45

Phe Val Ile Arg Pro Pro Gly Ser Gly Glu Lys Glu Pro His Pro Phe
50 55 60

Ser Leu Cys His His Phe Gly His Pro Ala Gly Leu Val Leu Gly Phe
65 70 75 80

Ala Leu Thr Ser Arg Lys Asp Ala Asn Pro Ser Leu Thr Pro Ala Arg
85 90 95

Ala Ala Thr Cys Leu Cys Arg Gly Asp Pro Ser Leu Met Thr Leu Arg
100 105 110

Cys Leu Glu Pro Ser Gly Asn Gly Gly Glu Gly Thr Arg Ser Gln Trp
115 120 125

Gly Thr Ala Gly Ser Ala Glu Glu Pro Ser Pro Gln Ala Ala Arg Leu
130 135 140

Ala Lys Ala Leu Arg Glu Leu Gly Gln Thr Gly Trp Tyr Trp Gly Ser
145 150 155 160

Met Thr Val Asn Glu Ala Lys Glu Lys Leu Lys Glu Ala Pro Glu Gly
165 170 175

Thr Phe Leu Ile Arg Asp Ser Ser His Ser Asp Tyr Leu Leu Thr Ile
180 185 190

Ser Val Lys Thr Ser Ala Gly Pro Thr Asn Leu Arg Ile Glu Tyr Gln
195 200 205

Asp Gly Lys Phe Arg Leu Asp Ser Ile Ile Cys Val Lys Ser Lys Leu
210 215 220

Lys Gln Phe Asp Ser Val Val His Leu Ile Asp Tyr Tyr Val Gln Met
225 230 235 240

Cys Lys Asp Lys Arg Thr Gly Pro Glu Ala Pro Arg Asn Gly Thr Val
245 250 255

His Leu Tyr Leu Thr Lys Pro Leu Tyr Thr Ser Ala Pro Ser Leu Gln
260 265 270

His Leu Cys Arg Leu Thr Ile Asn Lys Cys Thr Gly Ala Ile Trp Gly
275 280 285

Leu Pro Leu Pro Thr Arg Leu Lys Asp Tyr Leu Glu Glu Tyr Lys Phe
290 295 300

Gln Val
305

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 225 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Met Val Thr His Ser Lys Phe Pro Ala Ala Gly Met Ser Arg Pro Leu
1 5 10 15

Asp Thr Ser Leu Arg Leu Lys Thr Phe Ser Ser Lys Ser Glu Tyr Gln
20 25 30

Leu Val Val Asn Ala Val Arg Lys Leu Gln Glu Ser Gly Phe Tyr Trp
35 40 45

Ser Ala Val Thr Gly Gly Glu Ala Asn Leu Leu Leu Ser Ala Glu Pro
50 55 60

Ala Gly Thr Phe Leu Ile Arg Asp Ser Ser Asp Gln Arg His Phe Phe
65 70 75 80

Ala Leu Ser Val Lys Thr Gln Ser Gly Thr Lys Asn Leu Arg Ile Gln
85 90 95

Cys Glu Gly Gly Ser Phe Ser Leu Gln Ser Asp Pro Arg Ser Thr Gln
100 105 110

Pro Val Pro Arg Phe Asp Cys Val Leu Lys Leu Val Tyr His Tyr Met
115 120 125

Pro Pro Pro Gly Ala Pro Ser Phe Pro Ser Pro Pro Thr Glu Pro Ser
130 135 140

Ser Glu Val Pro Glu Gln Pro Ser Ala Gln Pro Leu Pro Gly Ser Pro
145 150 155 160

Pro Arg Arg Ala Tyr Tyr Ile Tyr Ser Gly Gly Glu Lys Ile Pro Leu
165 170 175

Val Leu Ser Arg Pro Leu Ser Ser Asn Val Ala Thr Leu Gln His Leu
180 185 190

Cys Arg Lys Thr Val Asn Gly His Leu Asp Ser Tyr Glu Lys Val Thr
195 200 205

Gln Leu Pro Gly Pro Ile Arg Glu Phe Leu Asp Gln Tyr Asp Ala Pro
210 215 220

Leu
225

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 225 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Met Val Thr His Ser Lys Phe Pro Ala Ala Gly Met Ser Arg Pro Leu
1 5 10 15

Asp Thr Ser Leu Arg Leu Lys Thr Phe Ser Ser Lys Ser Glu Tyr Gln
20 25 30

Leu Val Val Asn Ala Val Arg Lys Leu Gln Glu Ser Gly Phe Tyr Trp
35 40 45

Ser Ala Val Thr Gly Gly Glu Ala Asn Leu Leu Leu Ser Ala Glu Pro
50 55 60

Ala Gly Thr Phe Leu Ile Arg Asp Ser Ser Asp Gln Arg His Phe Phe
65 70 75 80

Thr Leu Ser Val Lys Thr Gln Ser Gly Thr Lys Asn Leu Arg Ile Gln
85 90 95

Cys Glu Gly Gly Ser Phe Ser Leu Gln Ser Asp Pro Arg Ser Thr Gln
100 105 110

Pro Val Pro Arg Phe Asp Cys Val Leu Lys Leu Val His His Tyr Met
115 120 125

Pro Pro Pro Gly Thr Pro Ser Phe Ser Leu Pro Pro Thr Glu Pro Ser
130 135 140

Ser Glu Val Pro Glu Gln Pro Pro Ala Gln Ala Leu Pro Gly Ser Thr
145 150 155 160

Pro Lys Arg Ala Tyr Tyr Ile Tyr Ser Gly Gly Glu Lys Ile Pro Leu
165 170 175

Val Leu Ser Arg Pro Leu Ser Ser Asn Val Ala Thr Leu Gln His Leu
180 185 190

Cys Arg Lys Thr Val Asn Gly His Leu Asp Ser Tyr Glu Lys Val Thr
195 200 205

Gln Leu Pro Gly Pro Ile Arg Glu Phe Leu Asp Gln Tyr Asp Ala Pro
210 215 220

Leu
225

(2) INFORMATION FOR SEQ ID NO: 23: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 510 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: not relevant

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Leu Tyr Trp Ser Ser Thr Val Val Ala Ala Ala Leu Glu Xaa Xaa Xaa
1 5 10 15

Xaa Xaa Gly Cys Xaa Xaa Xaa Glu Xaa Glu Gly Val Arg Ser Ser Pro
20 25 30

Val Val Ser Leu Ser Leu Pro Leu Xaa Arg Ala Arg Met Gly Arg Ala
35 40 45

Glu Leu Leu Glu Gly Lys Met Ser Thr Gln Asp Pro Ser Asp Leu Trp
50 55 60

Ser Arg Ser Asp Gly Glu Ala Glu Leu Leu Gln Asp Leu Gly Trp Tyr
65 70 75 80

His Gly Asn Leu Thr Arg His Ala Ala Glu Ala Leu Leu Leu Ser Asn
85 90 95

Gly Cys Asp Gly Ser Tyr Leu Leu Arg Asp Ser Asn Glu Thr Thr Gly
100 105 110

Leu Tyr Ser Leu Ser Val Arg Ala Lys Asp Ser Val Lys His Phe His
115 120 125

Val Glu Tyr Thr Gly Tyr Ser Phe Lys Phe Gly Phe Asn Glu Phe Ser
130 135 140

Ser Leu Lys Asp Phe Val Lys His Phe Ala Asn Gln Pro Leu Ile Gly
145 150 155 160

Ser Glu Thr Gly Thr Leu Met Val Leu Lys His Pro Tyr Pro Arg Lys
165 170 175

Val Xaa Glu Pro Ser Ile Tyr Glu Ser Val Arg Val His Thr Ala Met
180 185 190

Gln Thr Gly Arg Thr Glu Asp Asp Leu Val Pro Thr Ala Pro Ser Leu
195 200 205

Gly Thr Lys Glu Gly Tyr Leu Thr Lys Gln Gly Gly Leu Val Lys Thr
210 215 220

Trp Lys Thr Arg Trp Phe Thr Leu His Arg Asn Glu Leu Lys Tyr Phe
225 230 235 240

Lys Asp Gln Met Ser Pro Glu Pro Ile Arg Ile Leu Asp Leu Thr Glu
245 250 255

Cys Ser Ala Val Gln Phe Asp Tyr Ser Gln Glu Arg Val Asn Cys Phe
260 265 270

Cys Leu Val Phe Pro Phe Arg Thr Phe Tyr Leu Cys Ala Lys Thr Gly
275 280 285

Val Glu Ala Asp Glu Trp Ile Lys Ile Leu Arg Trp Lys Leu Ser Gln
290 295 300

Ile Arg Lys Gln Leu Asn Gln Gly Glu Ala Arg Ser Asp Leu Gly Arg
305 310 315 320

Ser Ser Leu Asn Arg Ser Phe Leu Pro Arg Asn Ala Leu Ala Gln Glu
325 330 335

Gln Val Glu Cys Phe Pro Xaa Arg Cys Asp Leu Xaa Gln Leu Gln Met
340 345 350

Lys Thr Asp Xaa Asp Phe Leu Ser Lys Thr Asn Gln Asn Arg Cys Xaa
355 360 365

Leu Gly Pro Ile Tyr His Val Ala Asp Ser Leu Cys Cys Pro Ser Xaa
370 375 380

Met Leu Pro Xaa Pro Xaa Glu His Xaa Ser Asn His His Xaa Asp Arg
385 390 395 400

Lys Cys Leu Asn His His Ser Xaa Val Cys Ser Leu Leu Glu His Thr
405 410 415

Met Glu Glu Glu Gly Phe Leu Phe Ser Leu Ile Val Val Pro Lys Pro
420 425 430

Ile Asp Thr Ser Cys Leu Glu Ser His Cys Glu Ser Trp Ser Ala Cys
435 440 445

Leu Thr Xaa Arg Leu Cys Tyr Xaa Pro Arg Arg Lys Gln Ile Leu Gly
450 455 460

Gly Leu Asp Asp Xaa Cys Arg Ile Tyr Ile Gln Ile Glu Asn Ile Lys
465 470 475 480

Tyr Phe Gln Gly Arg Gly Phe Phe Phe Xaa Phe Phe Pro Leu Tyr Thr
485 490 495

Lys Lys Lys Lys Lys Lys Leu Glu Gly Gly Pro Tyr Pro Xaa
500 505 510

(2) INFORMATION FOR SEQ ID NO: 24: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

TAAGGTCCAC GTCGCTCCGM AGCCATCACT ACAGKMCCGC GCCGTGGCCT CTGCGGCCCA 60 

CAAWCTCCGR GGAGACCTGC ATCAAGATGG AGGTGAGAGT CAAGGCCTTG GTTCACTCTT 120 

CCAGCCCGAG TCCAGCCCTG AATGGCGTCC GGAAGGATTT CCACGACCTC CAGTCTGAGA 180 

CCACGTGCCA GGAGCAAGCC AATTCACTGA AGAGCTCGGC TTCTCATAAT GGAGACCTGC 240

ATCTTCACCT GGATGAACAT GTGCCTGTCG TTATTGGACT TATGCCTCAG GACTACATTC 300 

AGTATACTGT GCCTTTAGAT GAGGGGATGT ATCCTTTGGA AGGATCACGG AGCTATTGTC 360 

TGGACAGCTC TTCTCCCATG GAAGTCTCTG CGGTTCCTCC TCAAGTGGGA GGGCGCGCTT 420 

TCCCCGAGGA TGAGAGTCAG GTAGACCAGG ACCTAGTTGT CGCCCCAGAG ATCTTCGTGG 480

ATCAGTCCGT GAATGGCTTG TTGATTGGCA CCACGGGAGT CATGTTGCAG AGCCCGAGAG 540

CGGGTCACGA TGATGTCCCT CCACTCTCAC CATTGCTACC TCCAATGCAG AATAATCAAA 600 

TCCAAAGGAA CTTCAGTGGA CTCACTGGCA CAGAAGCCCA CGTGGCTGAA AGTATGCGCT 660

GTCATTTGAA TTTTGATCCG AACTCTGCTC CTGGGGTTGC AAGAGTTTAT GACTCAGTGC 720 

AAAGTAGTGG TCCCATGGTT GTGACAAGCC TTACAGAGGA GCTGAAAAAA CTTGCAAAGC 780 

AAGGATGGTA CTGGGGACCA ATCACACGTT GGGAGGCAGA AGGGAAGCTA GCAAACGTGC 840 

CAGATGGTTC TTTTCTTGTT CGGGACAGTT CTGACGACCG TTACCTTTTA AGCTTGAGCT 900 

TTCGCTCCCA TGGTAAAACA CTTCACACTA GAATTGAGCA CTCAAATGGT AGGTTTAGCT 960

TTTATGAACA GCCAGATGTG GAAAGGACAT ACTCCATAGT TGATCTAATT GAGCATTCCA 1020 

TCCAGGGACT CGAAAATGGA GCTTTTTGTT ATTCAAGGTC TCGGCTGCCT GGATCTGCAA 1080 

CTTACCCCGT CAGACTGACC AACCCAGTGT CCCGGTTCAT GCAGGTGCGC TCGTTGCAGT 1140 

ACCTGTGTCG TTTTGTTATA CGTCAGTATA CCAGAATAGA CTTAATTCAG AAACTGCCTT 1200 

TGCCAAACAA AATGAAGGAT TATTTACAGG AGAAGCACTA CTGAAAGATT GAGAACCCTG 1260

CATCTTGCAC TTTGGGAATA AGAACAAGAG ATTGAAATAC AGTTTACAAA CTTTCATTGC 1320 

CATCAAAATC TTTTGCTGCC ATAACTATTT CAGTTTTATG TGTAAAAGAG TCATCAGTTT 1380 

GTTTAGGGGT GGGGAAGTGT CAGCAAGGTG TCTTGGGTTT ATTTTGGTTC TTTAAAAAAG 1440 

GGAAGTCTTG AAGTTTTAGA RGTGTTGAAT TATGTTTCAT CAATGTGCAG AATAATCACA 1500 

ATGTGAATTA TCAAATTCTC CTCAATGCCC CCCCCGCCCA KTCCTTTGCT GCTATCCACT 1560

GTGATTTTTA TGCATTAAAA GCMCATTTCA TGTKTTTTCA ACCCTAAGTA AAGTTGAATG 1620 

AAACTTAACR GAATGGAAAT TGCTATTTCT TTTTAAATGG YCCATTTTCC AAAAMARGTG 1680 

TTGAATAAMC AWMCCTGTKT GAATAAAACM MGRAWTTWMM WWTARCAMYG BAGRTGRGTT 1740

TTTAATCTYY TAMYTTDAAA AGATTTATTT AGAATYGKKA ATTGACMTAA TATTGGGTWA 1800 

TBGGRMCGGR GATCTGSAAC ATATKYTTTA ACAACAWTTT WTTKKCYTTA ATKKDTTTYY 1860

AARGKTGGBC TTATTWHTTT GGBKBBSVAA AGKWBVAHTT CTCYGTYSCC YTCGTTTTCA 1920 

TCTTCTAGTT TGTGNTATTT TAATAAATGG CCTTACATTA AAAAATTGTA AAGAAATGTA 1980 

TACCACCAAT TTAGAAATTG TTGCCTTTTC TGTAATTAAA CTCGGGTACA AATNGGCATA 2040 

ACATGAAAAC CTATGGAACT AGAATTATTA TTAAAGAAAT ATTAGATGAT CAT 2093

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1748 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

ATGGAGGCCG GAGAGGAACC GCTGCTGCTG GCCGAACTCA AGCCCGGGCG CCCCCACCAG 60 

TTTGATTGGA AGTCCAGCTG TGAAACCTGG AGCGTGGCCT TCTCGCCAGA CGGTTCCTGG 120 

TTCGCCTGGT CTCAAGGACA CTGCGTGGTC AAGCTGGTCC CCTGGCCCTT AGAGGAACAG 180

TTCATCCCTA AAGGATTCGA AGCCAAGAGC CGAAGCAGCA AGAATGACCC AAAAGGACGG 240 

GGCAGTCTGA AGGAGAAGAC GCTGGACTGT GGCCAGATTG TGTGGGGGCT GGCCTTCAGC 300 

CCATGGCCCT CTCCACCCAG CAGGAAACTC TGGGCACGTC ACCATCCCCA GGCGCCTGAT 360 

GTTTCTTGCC TGATCCTGGC CACAGGTCTC AACGATGGGC AGATCAAGAT TTGGGAGGTA 420 

CAGACAGGCC TCCTGCTTCT GAATCTTTCT GGCCACCAAG ACGTCGTGAG AGATCTGAGC 480

TTCACGCCCA GCGGCAGTTT GATTTTGGTC TCTGCATCCC GGGATAAGAC ACTTCGAATT 540 

TGGGACCTGA ATAAGCACGG TAAGCAGATC CAGGTGTTAT CCGGCCATCT GCAGTGGGTT 600 

TACTGCTGCT CCATCTCCCC TGACTGTAGC ATGCTGTGCT CTGCAGCTGG GGAGAAGTCG 660 

GTCTTTCTGT GGAGCATGCG GTCCTACACA CTAATCCGGA AACTAGAAGG CCACCAAAGC 720 

AGTGTTGTCT CCTGTGATTT CTCTCCTGAT TCAGCCTTGC TTGTCACAGC TTCGTATGAC 780

ACCAGTGTGA TTATGTGGGA CCCCTACACC GGCGAGAGGC TGAGGTCACT TCATCACACA 840 

CAGCTTGAAC CCACCATGGA TGACAGTGAC GTCCACATGA GCTCCCTGAG GTCCGTGTGC 900 

TTCTCACCTG AAGGCTTGTA TCTCGCTACG GTGGCAGATG ACAGRCTGCT CAGGATCTGG 960 

GCTCTGGAAC TGAAAGCTCC GGTTGCCTTT GCTCCGATGA CCAATGGTCT TTGCTGCACA 1020 

TTTTTYCCAC AYGGTGGAAT YATTGCCACA GGGACAAGAG ATGGCCACGT CCAGTTCTGG 1080

ACAGCTCCTA GGGTCCTGTC CTCACTGAAG CACTTATGCC GGAAAGCCCT TCGAAGTTTC 1140 

CTAACAACTT ACCAAGTCCT AGCACTGCCA ATCCCCAAGA AAATGAAAGA GTTCCTCACA 1200 

TACAGGACTT TTTAAGCAAC ACCACATCTT GTGCTTCTTT GTAGCAGGGT AAATCGTCCT 1260 

GTCAAAGGGA GTTGCTGGAA TAATGGGCCA AACATCTGGT CTTGCATTGA AATAGCATTT 1320 

CTTTGGGATT GTGAATAGAA TGTAGCAAAA CCAGATTCCA GTGTACTAGT CATGGRTCTT 1380

TCTCTCCCTG GGCATGTGGA AAGTCAGTCT TAGGAGGGAA GGAGATTCCA CTTGKCACGG 1440 

GCAACAGAGC CYTTACGTTT AAATTTTTCA GTCCAGTTAT KGAACAGCAA GTGTTTGAAM 1500 

TCTTTCTGGY TTGTTTTKGA WTTCAAAGTG GCAGTTACTG RWKGTTGTTT TTGGATTTAT 1560 

GGCAACYAAG TTAGGGCCTC CAGNGGTTNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 1620 

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNT HNABNVNRNN NRTNNNNRMA TNNNNNNNNN 1680

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 1740 

NNNNNNNN 1748

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2198 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

GGCGGTGGTG ATGGCGGCAG GCGCTCGGAC AGCTCCGCTT GAGCTGAGCT CGGAGAGATC 60

CGTCCAGAAA GTGCCCAGAA GAAACTTCCT CTTAGAAAAG CTGAAAACAC AATATTTATA 120

ACACTGGAAA TTGTAAAGAA TTTGTTTAAA ATGGCTGAAA ACAATAGTAA AAATGTAGAT 180

GTACGGCCTA AAACAAGTCG GAGTCGAAGT GCTGACAGGA AGGATGGTTA TGTGTGGAGT 240

GGAAAGAAGT TGTCTTGGTC CAAAAAGAGT GAGAGTTGTT CTGAATCTGA AGCCAAGAAA 300

GGGCAGCTTA GCTGTTCGTC CATTGAGTTG GACTTAGATC ATTCCTGTGG GCATAGATTT 360

TTAGGCCGAT CCCTTAAACA GAAACTGCAA GATGCGGTGG GGCAGTGTTT TCCAATAAAG 420

AATTGTAGTG GCCGACACTC TCCAGGGCTT CCATCTAAAA GAAAGATTCA TATCAGTGAA 480

CTCATGTTAG ATAMGTGYSC YTTCCCACCT CGCTCAGATT TAGCCTTTAG GTGGCATTTT 540

ATTAAACGAC ACACTGTTCC TATGAGTCCC AACTCAGATG AATGGGTGAG TGCAGACCTG 600

TCTGAGAGGA AACTGAGAGA TGCTCAGCTG AAACGAAGAA ACACAGAAGA TGACATACCC 660

TGTTTCTCAC ATACCAATGG CCAGCCTTGT GTCATAACTG CCAACAGTGC TTCGTGTACA 720

GGTGGTCACA TAACTGGTTC TATGATGAAC TTGGTCACAA ACAACAGCAT AGAAGACAGT 780

GACATGGATT CAGAGGATGA AATTATAACG CTGTGCACAA GCTCCAGAAA AAGGAATAAG 840

CCCAGGTGGG AAATGGAAGA GGAGATCCTG CAGTTGGAGG CACCTCCTAA GTTCCACACC 900

CAGATCGACT ACGTCCACTG CCTTGTTCCA GACCTCCTTC AGATCAGTAA CAATCCGTGC 960

TACTGGGGTG TCATGGACAA ATATGCAGCC GAAGCTCTGC TGGAAGGAAA GCCAGAGGGC 1020

ACCTTTTTAC TTCGAGATTC AGCGCAGGAA GATTATTTAT TCTCTGTTAG TTTTAGACGC 1080

TACAGTCGTT CTCTTCATGC TAGAATTGAG CAGTGGAATC ATAACTTTAG CTTTGATGCC 1140

CATGATCCTT GTGTCTTCCA TTCTCCTGAT ATTACTGGGC TCCTGGAACA CTATAAGGAC 1200

CCCAGTGCCT GTATGTTCTT TGAGCCGCTC TTGTCCACTC CCTTAATCCG GACGTTCCCC 1260

TTTTCCTTGC AGCATATTTG CAGAACGGTT ATTTGTAATT GTACGACTTA CGATGGCATC 1320

GATGCCCTTC CCATTCCTTC GCCTATGAAA TTGTATCTGA AGGAATACCA TTATAAATCA 1380

AAAGTTAGGT TACTCAGGAT TGATGTGCCA GAGCAGCAGT GATGCGGAGA GGTTAGAATG 1440

TCKACCGGAG CTTTYGTTCC CTTTAGTGAG GGTTAATTTC GAGCTTGGCG TAATCATGGT 1500

CATAGCTGTT TCCTGTGTGA AATYGTYATC CGCTCACAAT TCCACACAAC ATACGAGCCG 1560 

GAAGCATAAA GTGTAAAGCC TGGGGTGCCT AATGAGTGAG CTAACTCACA TTAATTGSGT 1620

YGCGCTCACT GCCCGCTTTC CAGTCGGGAA ACCTGTCGTG CCASCTGCAT TAMTGAATCN 1680 

GCCAACKCGC NGGGANAGCG GTTNGCNTAT TGGGCGCTCT TCACTTCNTC GCTCACTGAN 1740 

TCNCTNCCTC GGTCNTTCGN TGCTGCTACN GTNTCCCCCA TCCAAGCGTT ATACGCTATC 1800 

CNCAGAACTG GGAAANNCNG AANACNNTNA CAAAGCTCAN TGCTANCGTA NACGCCNTGC 1860 

NGGCTTTTCC TCGTCCCCCN ACACNCTAAA CAGCCCTCGA GTGCAACCNC GATATANATN 1920

TCTTCCCTNA ACCCCTGCCT CTGTCNCCGC CTNCGACTTC GCTTCCNNGG ATTGCTTTCN 1980 

CCCCGTAGTC NGTCNTAGTG NGCNGCGCCT TCCACCCTTC NACCNCTACG TANNNNNANN 2040 

CNCCAAANCC NCCNCCCCTC NGATAAAAAG TNAGNGCCTT NANNNCCNNG ATAAAAATGG 2100 

TCCCNTACTT TCCAATGTCT NCCNCCCGGC TNTTCTNGCC ACCCAANTNA NNTTTCCGGN 2160 

ACTGNATCCG GTGCTANCNT CCTGTTTCTC CTCCCNCC 2198

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2254 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

CTCGGGCCGG GATGGATCCG CCGGGAAGAG GAAGACAAGC GGAGCGTTGA GCCCCTGCGC 60

ACGGTGCCCC GCGCGTAGTG GGAGCTTACT CGCAGTAGCT CTCGCTCTTC TAATCAATGG 120

ATAAAGTGGG GAAAATGTGG AACAACTTAA AATACAGATG CCAGAATCTC TTCAGCCACG 180

AGGGAGGAAG CCGTAATGAG AACGTGGAGA TGAACCCCAA CAGATGTCCG TCTGTCAAAG 240

AGAAAAGCAT CAGTCTGGGA GAGGCAGCTC CCCAGCAAGA GAGCAGTCCC TTAAGAGAAA 300

ATGTTGCCTT ACAGCTGGGA CTGAGCCCTT CCAAGACCTT TTCCAGGCGG AACCAAAACT 360

GTGCCGCAGA GATCCCTCAA GTGGTTGAAA TCAGCATCGA GAAAGACAGT GACTCGGGTG 420

CCACCCCAGG AACGAGGCTT GCACGGAGAG ACTCCTACTC GCGGCACGCC CCGTGGGGAG 480

GAAAGAAGAA ACATTCCTGT TCCACAAAGA CCCAGAGTTC ATTGGATACC GAGAAAAAGT 540

TTGGTAGAAC TCGAAGCGGC CTTCAGAGGC GAGAGCGGCG CTATGGAGTC AGCTCCATGC 600

AGGACATGGA CAGCGTTTCT AGCCGCGCGG TCGGGAGCCG CTCCCTGAGG CAGAGGCTCC 660

AGGACACGGT GGGTTTGTGT TTTCCCATGA GAACTTACAG CAAGCAGTCA AAGCCACTCT 720

TTTCCAATAA AAGAAAAATM CATCTYTCTG AATTAATGCT KGAGAAATGC CCTTTTCCTG 780

CTGGCTCRGA TTTAGCMCAA AAGTGGCATT TGATTAAACA GCATACAGCT CCTGTGAGCC 840 

CACATTCAAC ATTTTTTGAT ACRTTTGATC CATCTTTGGT TTCTACAGAA GATGAAGAAG 900 

ATAGGCTTAG AGAGAGAAGG CGGCTTAGTA TTGAAGAAGG GGTTGATCCC CCTCCCAATG 960 

CACAAATACA TACATTTGAA GCTACTGCAC AGGTTAATCC ATTATTTAAA CTGGGACCAA 1020

AATTAGCTCC TGGAATGACT GAAATAAGTG GGGACAGTTC TGCAATTCCA CAAGCTAATT 1080 

GTGACTCGGA AGAGGATACA ACCACCCTGT GTTTGCAGTC ACGGAGGCAG AAGCAGCGTC 1140 

AGATATCTGG AGACAGCCAT ACSCATGTTA GCAGACAGGG AGCTTGGAAA GTCCACACAC 1200 

AGATTGATTA CATACACTGC CTCGTGCCTG ATTTGCTTCA AATTACAGGG AATCCCTGTT 1260 

ACTGGGGAGT GATGGACCGT TATGAAGCAG AAGCCCTCTC CGAAGGGAAA CCKGAAGGCA 1320

CGTTCTTGCT CAGGGACTCT GCACAGGAGG ACTACCTCTT CTCTGTGAGT TCCGCCGCTA 1380 

CAACAGGATC TCTGCACGCC CGGATCGAGC AGTGGAACCA CAACTTCAGC TTCGATGCCC 1440 

ATGACCCCTG CGTGTTTCAY TCCTCCACTG TCACGGGGCT TCTCGAACAC TATAAAGAYC 1500 

CCAGTTCKTG CATGTTTTTT GAACCGTTGC TAACGATATC ACTSAATAGR ACTTTCCCTT 1560 

TCAGCCTGCA GTATATCTGC CGCGCAGTGA TCTGCAGATG CACTACGTAT GATGGGATTG 1620

ACGGGCTCCC GCTACCGTCG ATGTTACAGG ATTTTTTAAA AGAGTATCAT TATAAACAAA 1680 

AAGTTAGAGT TCGCTGGTTG GAACGAGAAC CAGTCAAGGC AAAGTAAACT CTCCGGTCCC 1740 

CAAAGGGTGT TAACTAGGTC CGCTTTCATG TGCATCAGAC AGTACACCTA TAGCAAGCAC 1800 

ACGTAGCAGT GTTAGGCTTT TTCATACAGT ATGTAAGCTT AGTGTTAGTA TCTGTCAGAT 1860 

GCTACCTGCT GTTACTTATT CAGATAAACA TGGTGCCTAT TGGAACAATA GCGGATAGAG 1920

CTACAGGTGT TCAGTAAGAC TACAAAAACA TTTTGCCTAT TTCGCTAACA GTTTGGTTTT 1980 

TAATGGCTGT GGTATTTGAG TGAGGCAAYY CTGGGGCATT TGTTATGAAG AATTCTATTT 2040 

CTTACTGAAG AACAAATWAT TAATATTGGA TGAGTATTTC AACAGTGTGA CTAATGTTTG 2100 

AAATTATTTT TTCCTAAGAG TTTTTCCWAT AACCTTCCMA AAGTCGTGAT GTTTGTAGTT 2160 

ACCATAATCC AGCTTTGRAG TCCMAAARGA TTAAAGRCYG CCTCCCTTTG RAAAATGCCA 2220

TTTCYKGCCC CAAGGCCTAG TGCCGTCCCT NCGG 2254 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2206 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GGAGCGCGGC CTGGAGACTA ACAGCTGCTC GGAAGAGGAG CTCAGCAGCC CGGGTCGCGG 60

AGGAGGAGGG GGCGGCCGGC TTCTGCTGCA GCCCCCAGGC CCTGAATTAC CTCCGGTGCC 120

CTTCCCGCTG CAGGACTTGG TCCCTCTGGG GCGCCTGAGT AGAGGGGAGC AGCAGCAGCA 180

GCAGCAGCAG CAACCTCCCC CGCCCCCGCC TCCTCCCGGG CCCCTCCGGC CACTCGCGGG 240

TCCTTCTCGG AAGGGCTCCT TCAAAATCCG CCTCAGTCGC CTCTTTCGCA CCAAGAGCTG 300

CAACGGTGGC TCCGGCGGTG GGGATGGGAC CGGCAAGAGG CCTTCTGGAG AGCTGGCTGC 360

TTCAGCTGCG AGCCTGACAG ACATGGGAGG CTCTGCGGGC CGGGAGCTGG ACGCGGGGAG 420

GAAACCCAAG TTGACAAGAA CTCAAAGTGC CTTTTCTCCG GTCTCCTTCA GCCCCCTGTT 480

CACAGGTGAA ACTGTGTCGC TTGTGGATGT GGACATTTCT CAGCGGGGCC TGACCTCTCC 540

ACACCCTCCA ACTCCCCCTC CTCCTCCGAG AAGAAGCCTC AGCCTCCTAG ATGATATCAG 600

TGGGACGCTG CCTACATCTG TCCTTGTGGC TCCGATGGGG TCTTCCTTGC AGTCTTTCCC 660

CCTACCTCCG CCTCCTCCAC CCCATGCCCC AGATGCATTT CCCCGGATTG CTCCCATCCG 720

AGCAGCTGAA TCCCTGCACA GCCAACCCCC ACAGCACCTC CAGTGTCCCC TCTACCGGCC 780

TGACTCGAGC AGCTTTGCAG CCAGCCTTCG AGAGTTGGAG AAGTGTGGTT GGTATTGGGG 840

GCCAATGAAT TGGGAAGATG CAGAGATGAA GCTGAAAGGG AAACCAGATG GTTCTTTCCT 900

GGTACGAGAC AGTTCTGATC CTCGTTACAT CCTGAGCCTC AGTTTCCGAT CACAGGGTAT 960

CACCCACCAC ACTAGAATGG AGCACTACAG AGGAACCTTC AGCCTGTGGT GTCATCCCAA 1020

GTTTGAGGAC CGCTGTCAAT CTGTTGTAGA GTTTATTAAG AGAGCCATTA TGCACTCCAA 1080

GAATGGAAAG TTTCTCTATT TCTTAAGATC CAGGGTTCCA GGACTGCCAC CAACTCCTGT 1140

CCAGCTGCTC TATCCAGTGT CCCGATTCAG CAATGTCAAA TCCCTCCAGC ACCTTTGCAG 1200

ATTCCGGATA CGACAGCTCG TCAGGATAGA TCACATCCCA GATCTCCCAC TGCCTAAACC 1260

TCTGATCTCT TATATCCGAA AGTTCTACTA CTATGATCCT CAGGAAGAGG TATACCTGTC 1320

TCTAAAGGAA GCGCAGCTCA TTTCCAAACA GAAGCAAGAG GTGGAACCCT CCACGTAGCG 1380

AGGGGCTCCC TGCTGGTCAC CACCAAGGGC ATTTGGTTGC CAAGCTCCAG CTTTGAAGAA 1440

CCAAATTAAG CTACCATGAA AAGAAGAGGA AAAGTGAGGG AACAGGAAGG TTGGGATTCT 1500

CTGTGCAGAG ACTTTGGTTC CCCACGCAGC CCTGGGGCTT GGAAGAAGCA CATGACCGTA 1560

CTCTGCGTGG GGCTCCACCT CACACCCACC CCTGGGCATC TTAGGACTGG AGGGGCTCCT 1620

TGGAAAACTG GAAGAAGTCT CAACACTGTT TCTTTTTCAA AAAAAAAAAA AAAAAAGATG 1680

CGGCCGCAAG CTTATTCCCT TTAGTGAGGG TTAATTTTAG CTTGGCACTG GCCGTCGTTT 1740

TACAACGTCG TGACTGGGAA AACCCTGGCG TTACCCAACT TAATCGCCTT GCAGCACATC 1800

CCCCTTTCGC CAGCTGGCGT AATAGCGAAG AGGCCCGCAC CGATCGCCCT TCCCAACAGT 1860

TGCGCAGCCT GAATGGCGAA TGGGACGCGC CCTGTAGCGG CGCATTAACG CGCGGCGGGT 1920 

GTGGTGGTTA CGCGCAGCGT GACCGCTACA CTTGCCAGCG CCCTACGCCC GCTCCTTTCG 1980

CTTTCTTCCC TTCCTTTCTC GCCACGTTCG CCGGCTTTCC CCGTCAACTC TAAATCGGGG 2040 

GCTCCCTTTA GGTTCCGATT TANTGCTTTA CGCACTCNAC CCCAAAACTT GATTAGGTGA 2100 

TGTCACTTAT GGCACNCCTG ATAACGTTTC CCCTTACTTT GATCACTTCT TTATATGATC 2160 

TTTCCAATGA AACATCACCT ACTCGTCATC TTTATTTAAA GATTTG 2206

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1390 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:



CGGACGCGTG GGTTTGGCTG TGAATATTCT ATTTGCTTGC AGTATCTGTT TCTCTTCCTA 60

GGCTCAAGTT GGTGACCCAA GCCTATTGTA AACAAGTGAT TATCTCANNG GGAGATGCCA 120

ATGGAGTAAC AATTTGTTAA CCTTACGTTT TCTGTCTGTA TATTTTTTTA AAAATCTGGT 180

AGTTTCTGGA AAAAAAAGAG AAGGGGGTTT GTAGTACTTA ACCCTATTTA TTKSCRYRWG 240

TTTTAGTTAA TTAGTTTTTG GAATAAATGG ATTTCAGTAT AGCTTTGTGG TTAAATTGCA 300

TTGCCTTTAT TTTATGTTTA GGCTTATTTT TAAATTAACA TTTAACAGAA ACATTTGAAA 360

TAGAATTTGC ATGTCTGCCT TAATTAACTT AAAGACTGAT TTTAATCTGA CTATGACACT 420

GAGCATATTC TTTAAATTAC TCATAATTTA TAATGCTTAA TATAATCTTA ATTAAATTTA 480

GCAGTTTTAG TATAAGATGT GCCATTTTGT CCTCTGTATG TCTGAATGAA GCTATAACAT 540

TTGCCTTTTT ATTGCAGGTT TTCCTTTGGA ATATGGATAA ATACACCATG ATACGGAAAC 600

TAGAAGGACA TCACCATGAT GTGGTAGCTT GTGACTTTTC TCCTGATGGA GCATTACTGG 660

CTACTGCATC TTATGATACT CGAGTATATA TCTGGGATCC ACATAATGGA GACATTCTGA 720

TGGAATTTGG GCACCTGTTT CCCCCACCTA CTCCAATATT TGCTGGAGGA GCAAATGACC 780

GGTGGGTACG ATCTGTATCT TTTAGCCATG ATGGACTGCA TGTTGCAAGC CTTGCTGATG 840

ATAAAATGGT GAGGTTCTGG AGAATTGATG AGGATTATCC AGTGCAAGTT GCACCTTTGA 900

GCAATGGTCT TTGCTGTGCC TTCTCTACTG ATGGCAGTGT TTTAGCTGCT GGGACACATG 960

ACGGAAGTGT GTATTTTTGG GCCACTCCAC GGCAGGTCCC TAGCCTGCAA CATTTATGTC 1020

GCATGTCAAT CCGAAGAGTG ATGCCCACCC AAGAAGTTCA GGAGCTGCCG ATTCCTTCCA 1080

AGCTTTTGGA GTTTCTCTCG TATCGTATTT AGAAGATTCT GCCTTCCCTA GTAGTAGGGA 1140

CTGACAGAAT ACACTTAACA CAAACCTCAA GCTTTACTGA CTTCAATTAT CTGTTTTTAA 1200 

AGACGTAGAA GATTTATTTA ATTTGATATG TTCTTGTACT GCATTTTGAT CAGTTGARGC 1260 

TTTTAAAATA TTATTTATAG ACAATAGAAG TATTTCTGAA CATATCAAAT ATAAATTTTT 1320 

TTAAAGATCT AACTGTGAAA AACATACATA CCTGTACATA TTTAGATATA AGCTGCTATA 1380

TGTTGAATGG 1390 



